Methodologies to test hypotheses about the tail-heaviness
of an underlying distribution are introduced based on results of
Rojo (1996) using the limiting behavior of the extreme spacings.
The tests are consistent and have point-wise robust levels
in the sense of Lehmann (2005) and Lehmann and Loh (1990).
Simulation results based on these new methodologies indicate
that the tests exhibit good control of the probability of Type
I error and have good power properties for finite sample sizes.
The tests are compared with a test proposed by Bryson (1974)
and it is seen that, although Bryson's test is competitive with
the tests proposed here, Bryson's test does not have point-wise
robust levels. The operating characteristics of the tests are also
explored when the data are blocked. It turns out that the power
increases substantially by blocking. The methodology is illustrated
by analyzing various data sets.

1. Introduction. Tail behavior of a probability distribution plays an important role in various
applications including hydrology, aerospace engineering, meteorology, insurance, and finance.
Lehmann (1988) proposed a pure-tail ordering in connection with the comparison, in terms of
efficiency, of location experiments. In classical extreme value theory, the Extremal Types Theorem
(ETT), also known as the Three Types Theorem, classifies the right tail of a distribution
according to the asymptotic distribution of the standardized maximum. Thus, the distribution F
is short-, medium-, or long-tailed depending on whether F is in the domain of attraction of the
Weibull, Gumbel, or Fréchet distribution, respectively. It is well known, however, that the limiting distribution
for the standardized maximum does not exist for all distributions. For instance, any distribution 
that assigns positive mass to the right endpoint of its support cannot be classified by the ETT. 
Another possible drawback of the classical categorization of probability laws using the ETT is that 
the class of medium-tailed distributions may be too large. For instance, Schuster (1984) argues
that "the statistician considers the normal distribution shorter than the exponential which is in
turn shorter than the lognormal distribution". Yet all three are in the domain of attraction of the
Gumbel distribution. Thus, there is a need to classify distributions by alternative schemes.

Another possible avenue for such classification may be obtained through the tail-heaviness of 
a distribution. There exists a robust literature on orderings that attempt to order distributions 
according to tail-heaviness. Some of the early work allows for the middle part of the distribution to 
affect the tail ordering. See, e.g., Loh (1984), Doksum (1969), and Lehmann (1988). By contrast, 
in a series of papers, Rojo (1988, 1992, 1993, 1996) proposes pure-tail orderings which allow for
pure-tail comparisons without many of the technical assumptions required by other approaches. 

Various methods for distinguishing between exponential and power-tails have been proposed 
in the literature and are used in practice. The more popular ones are based on plotting various
quantities whose behavior depends on the tail behavior of the underlying distribution. For example,
Heyde and Kou (2004) discuss methods based on plotting the mean residual life function, Q-Q 
plots, conditional moment generating functions, Hill estimation, and likelihood methods. Heyde 
and Kou (2004) argue that these methods are qualitative without any support for their "statistical 
precision". The Hill estimator is popular in practice, but it has its share of problems. In spite of
the various results concerning its asymptotic properties, it can behave poorly, as the "Hill horror
plots" in Embrechts et al. (1997) illustrate. Moreover, its use should be restricted to the case of
power tails: when applied to other types of tails, Hill estimation may be misleading because it
provides no warning. Thus the use of Hill estimation may
be inappropriate for distinguishing power tails from other classes of tails. Somewhat surprisingly,
Heyde and Kou (2004) conclude that it may be necessary to have sample sizes in the tens, and 
sometimes in the hundreds, of thousands to be able to differentiate between power and exponential 
tails. The main reason for this is that the large quantiles of exponentially-tailed distributions may 
actually exceed the counterpart quantiles of power-like tails. This characteristic will be observed in 
our simulation work. One way to ameliorate this problem is by blocking the data. 
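
As a point of reference for the discussion of Hill estimation above, the following is a minimal sketch of the Hill estimator of the tail index γ for a power tail 1 - F(x) ≈ x^(-1/γ). The function name `hill_estimate`, the choice k = 200, the sample size, and the seed are illustrative assumptions and are not taken from the paper. The sketch returns a positive number for any positive sample, which is one way to see why the estimator gives no warning when the tail is not a power tail.

```python
import math
import random


def hill_estimate(sample, k):
    """Hill estimate of the tail index gamma from the k largest observations.

    Assumes 1 <= k < len(sample) and positive upper order statistics.
    For a power tail 1 - F(x) ~ x**(-1/gamma) the estimate targets gamma.
    """
    if not 1 <= k < len(sample):
        raise ValueError("k must satisfy 1 <= k < len(sample)")
    ordered = sorted(sample, reverse=True)
    if ordered[k] <= 0:
        raise ValueError("the k+1 largest observations must be positive")
    threshold = math.log(ordered[k])  # log of the (k+1)-th largest value
    return sum(math.log(x) - threshold for x in ordered[:k]) / k


rng = random.Random(0)  # fixed seed, only so the sketch is reproducible
pareto_sample = [rng.paretovariate(2.0) for _ in range(5000)]  # 1 - F(x) = x**-2
expo_sample = [rng.expovariate(1.0) for _ in range(5000)]      # exponential tail

gamma_pareto = hill_estimate(pareto_sample, k=200)  # targets gamma = 1/2
gamma_expo = hill_estimate(expo_sample, k=200)      # a number is returned anyway
print(round(gamma_pareto, 3), round(gamma_expo, 3))
```

For the exponential sample the returned value has no power-tail interpretation, yet nothing in the output signals this.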

The purpose of this paper is to develop methodologies to test hypotheses about the tail-heaviness 
of a distribution based on the results of Rojo (1996). The paper will focus on the right tail of the 
distribution, but analogous results are easily seen to hold for the left tail by considering instead 
the behavior of the random variables $\{-X_i, i = 1, \ldots, n\}$. When the underlying distribution is
symmetric about zero, one may take $\{|X_i|, i = 1, \ldots, n\}$, in effect doubling the sample size.
Theorem 3.1 and Corollary 4.2 in Rojo (1996) provide the results needed to develop methodologies
to test the hypothesis that data arises from a medium-tailed distribution against an alternative
of a short- or long-tailed distribution based on the asymptotic distribution of the extreme spacing
$X_{(n)} - X_{(n-1)}$, where $X_{(k)}$ represents the $k$th order statistic from a random sample $X_1, \ldots, X_n$ from
F. The distribution F is assumed to be continuous and strictly increasing throughout this work.

The organization of the paper is as follows. Section 1 provides the introductory material, and
Sections 2 and 3 give a brief discussion of the classification schemes developed by Parzen (1979)
and Schuster (1984). Section 4 discusses the most relevant results from Rojo (1996), and Section 5
presents a new test for tail-heaviness. The test is consistent against short- and long-tailed
alternatives, and the level of the test is point-wise robust. Simulation results, reported in
Section 6, indicate that for small sample sizes the test exhibits good control of the probability
of Type I error and has good power properties. A comparison with a test proposed by Bryson (1974)
shows that, although Bryson's test behaves well against distributions with linear mean residual
life functions, its power is poor against distributions with quadratic mean residual life
functions, and its probability of Type I error is close to 1 for the gamma and log-gamma
distributions (which are medium-tailed). Bryson's test may therefore not be a good choice for the
testing situation of interest in this work. A simulation study in which the data are blocked to
increase the power of the test is also discussed. Finally, the methodology is illustrated by
applying it to several published data sets: maximum discharge of the Feather River, glass breaking
strength, and the Belgian Secure Re claim size data.

2. Classification Based on the Density-quantile Function. Parzen (1979) argued
that many distributions have density-quantile functions of the form

(1)    $fQ(u) \sim (1 - u)^{\alpha}, \quad \alpha > 0,$

where $f$ denotes the density function, $Q$ is the quantile function (the left-continuous inverse of
$F$), and $g_1(u) \sim g_2(u)$ means that $g_1(u)/g_2(u)$ tends to a positive finite constant as $u \to 1$. The
parameter $\alpha$ is called the tail exponent, and Parzen (1979) defined a distribution to be short-,
medium-, or long-tailed according to whether $\alpha < 1$, $\alpha = 1$, or $\alpha > 1$. When $\alpha = 1$ the relation
indicated by (1) may be written, in many cases, more precisely as

(2)    $fQ(u) \sim (1 - u)\{\ln (1 - u)^{-1}\}^{1 - \beta}, \quad 0 < \beta < 1,$

where $\beta$ is a shape parameter. Let $L$ denote a slowly varying function from the left at 1. The precise
statement associated with equation (1) is provided by

(3)    $fQ(u) = L(u)(1 - u)^{\alpha}.$

Relationship (1) is motivated by results of Andrews (1973) for approximating the area under the
tail of the distribution F. It turns out that this density-quantile representation applies to many
common distributions, but not all. An example from Parzen (1979) of a distribution which does
not have this density-quantile representation is $1 - F(x) = \exp(-x - 0.75 \sin x)$. To ensure that
(3) holds, one must restrict attention to tail-monotone densities as discussed by Parzen (1979). In
addition, Parzen (1979) states that the lognormal distribution is an example with $\alpha = 1$ for which an
expression for its density-quantile function similar to (2) is not possible. Table 1 (see Parzen (1979))
gives the density-quantile function and classification for many common distributions. Although
the classification scheme defined through the density-quantile representation is of theoretical
interest, a major drawback for our purpose is that it does not lend itself to direct use in
classifying distributions based on data. It is difficult to check the technical assumptions needed
for the representation to hold, and the various issues
associated with estimating a density arise here as well. In some cases, it is possible to estimate the
tail exponent $\alpha$ in (1) under a restricted form for $fQ(u)$. For instance, under the assumption that
$fQ(u) \sim \gamma^{-1}(1 - u)^{1+\gamma}$ for $\gamma > 0$, one can estimate $\gamma$ using, for instance, the commonly used Hill
estimator. As discussed earlier, however, this approach is not without problems in the case that in
fact $fQ(u) \not\sim \gamma^{-1}(1 - u)^{1+\gamma}$ for $\gamma > 0$.

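Table 1

Density-quantile Functions for Various Common Distributions

Distribution        Density-quantile function, fQ(u)                                                                            Classification
Uniform(0,1)        $1$                                                                                                         Short
Exponential         [entry not recoverable from source]                                                                         Medium
Logistic            $u(1 - u)$                                                                                                  Medium
Weibull($\gamma$)   $\gamma(1 - u)\{\log\frac{1}{1-u}\}^{1 - 1/\gamma}$                                                         Medium
Extreme Value       $(1 - u)\log\frac{1}{1-u}$                                                                                  Medium
Normal              $\frac{1}{\sqrt{2\pi}}\exp\{-\frac{1}{2}[\Phi^{-1}(u)]^2\} \sim (1 - u)\{2\log\frac{1}{1-u}\}^{1/2}$        Medium
Cauchy              $\frac{1}{\pi}\sin^2 \pi u \sim (1 - u)^2$                                                                  Long
Pareto($\gamma$)    $\frac{1}{\gamma}(1 - u)^{1+\gamma}$                                                                        Long
Burr($\gamma$, r)   [entry not recoverable from source]                                                                         Long
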

However, as the classification based on (1) yields the same results as those obtained using the
ETT, when the necessary technical conditions apply for both schemes (see Parzen (1980)),
methodologies based on the asymptotic distribution of the standardized maximum can be used to classify
distributions based on a random sample. In this case, however, the challenge is to come up with the
correct sequences of constants to standardize the maximum. As F is unknown, these will have to be
estimated from the data, thus complicating the analyses. Section 4 discusses results that circumvent
many of these technical problems. The resulting methodology is easy to implement and possesses
good operating characteristics. Another possible drawback of a classification scheme based on (1)
is that the "medium-tailed" category may be too large, as discussed next.

3. Refinements and a Classification based on Extreme Spacings. As the classification
scheme based on (1) is not sensitive enough to distinguish, for example, among the
normal, exponential, and lognormal distributions, Schuster (1984) refined Parzen's density-quantile
approach for the medium tail class. This refinement classifies distributions such as the normal,
exponential, and lognormal into separate categories. The following definitions, which use the limiting
value of the failure rate function $r_F(x) = f(x)/\{1 - F(x)\}$, give the Refined-Parzen (RP)
Classification. Let

(4)    $\alpha = \lim_{u \to 1^-} -(1 - u) f'Q(u)/[fQ(u)]^2$, and

(5)    $c = \lim_{u \to 1^-} (1 - u)/fQ(u) = \lim_{u \to 1^-} 1/r_F(Q(u)).$

A distribution belongs to one of the following categories when the given conditions hold:

Short            $\alpha < 1$
Medium-Short     $\alpha = 1$, $c = 0$
Medium-Medium    $\alpha = 1$, $0 < c < \infty$
Medium-Long      $\alpha = 1$, $c = \infty$
Long             $\alpha > 1$

The RP method classifies the normal, exponential, and lognormal distributions as medium-short,
medium-medium, and medium-long, respectively. Unfortunately, as with Parzen's classification
scheme, the RP classification cannot be implemented easily to classify distributions from data,
as it requires estimating $fQ(u)$ and $f'Q(u)$ for values of u close to one.

Schuster (1984) provided a scheme to classify distributions by tail behavior through the asymptotic
behavior of the extreme spacing (ES), that is, the difference between the largest and
second largest observations. When the quantile function Q is differentiable in an open left interval of
1 and c defined by (5) exists, Schuster categorizes distributions by the ES as follows.

Theorem 1 Let $X_1, X_2, \ldots, X_n$ be a random sample from the distribution $F(x)$. Define $S_n = X_{(n)} - X_{(n-1)}$
and assume that c defined by equation (5) exists. Then,
(i) $c = 0$ if and only if $S_n = o_p(1)$,
(ii) $c = a$, $0 < a < \infty$, if and only if $S_n = O_p(1)$, $S_n \neq o_p(1)$,
(iii) $c = \infty$ if and only if $S_n \xrightarrow{P} \infty$,

where $o_p(1)$ means that the sequence of random variables converges to zero in probability, and $O_p(1)$
means that the sequence is bounded in probability.

The distribution F is then said to be ES short, ES medium, or ES long, when (i), (ii), or (iii)
holds, respectively. Theorem 1 makes the connection between the behavior of the extreme spacing
and the limiting behavior of the failure rate function when c defined by (5) exists. Write
$\bar F = 1 - F$ for the survival function. The failure rate
goes to zero (e.g. the Pareto distribution $\bar F(x) = x^{-a}$, with $a > 0$) if and only if the extreme spacing
$S_n$ converges to infinity in probability, and the failure rate goes to infinity (e.g. $\bar F(x) = e^{-e^{x}}$) if and
only if the extreme spacing goes to zero in probability. Otherwise, the failure rate converges to a
finite positive value (e.g. $\bar F(x) = e^{-x}$) if and only if the extreme spacing does not converge to zero
but remains bounded in probability.
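
The three regimes described above can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the helper names, the sample sizes, the number of replications, and the seed are hypothetical choices, and the standard normal, Exp(1), and Pareto with survival function 1/x are used as stand-ins for failure rates tending to infinity, to a positive constant, and to zero.

```python
import heapq
import random
import statistics


def extreme_spacing(sample):
    """Return S_n = X_(n) - X_(n-1), the gap between the two largest values."""
    largest, second = heapq.nlargest(2, sample)
    return largest - second


def median_spacing(draw, n, reps, rng):
    """Median of S_n over `reps` independent samples of size n."""
    return statistics.median(
        extreme_spacing([draw(rng) for _ in range(n)]) for _ in range(reps)
    )


rng = random.Random(1)  # fixed seed, only so the sketch is reproducible
samplers = {
    "normal": lambda r: r.gauss(0.0, 1.0),        # failure rate -> infinity
    "exponential": lambda r: r.expovariate(1.0),  # failure rate -> 1
    "pareto": lambda r: r.paretovariate(1.0),     # failure rate -> 0
}
results = {
    name: [median_spacing(draw, n, 300, rng) for n in (100, 1000)]
    for name, draw in samplers.items()
}
for name, medians in results.items():
    print(name, [round(m, 3) for m in medians])
```

In runs of this kind the exponential spacing stays near the Exp(1) median ln 2 for both sample sizes, the normal spacing is smaller, and the Pareto spacing grows with n.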

Schuster (1984) also made a connection between the RP classification and the ES classification
method. Using the density-quantile representation (3) and properties of slowly varying functions,
it follows that

(6)    $\lim_{u \to 1^-} \frac{fQ(u)}{1 - u} = \lim_{u \to 1^-} L(u)(1 - u)^{\alpha - 1} = \begin{cases} 0, & \text{if } \alpha > 1, \\ 0 \le \lim_{u \to 1^-} L(u) \le \infty, & \text{if } \alpha = 1, \\ \infty, & \text{if } \alpha < 1. \end{cases}$




Therefore the ES short category consists of the RP short and RP medium-short categories; the 
ES medium category corresponds to the RP medium-medium class; and the ES long category 
consists of the RP medium-long and RP long categories. 
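
The correspondence between the RP and ES categories stated above can be written as a small lookup. This is a minimal sketch; the function names `rp_category` and `es_category` and the string labels are hypothetical, and the inputs are assumed to be the limits α and c from (4) and (5), known exactly.

```python
import math

# RP category -> ES category, as stated in the paragraph above.
RP_TO_ES = {
    "short": "short",
    "medium-short": "short",
    "medium-medium": "medium",
    "medium-long": "long",
    "long": "long",
}


def rp_category(alpha, c):
    """Refined-Parzen category from the tail exponent alpha in (4) and c in (5)."""
    if alpha < 1:
        return "short"
    if alpha > 1:
        return "long"
    if c == 0:
        return "medium-short"
    if math.isinf(c):
        return "medium-long"
    return "medium-medium"


def es_category(alpha, c):
    """Extreme-spacing category implied by the RP category."""
    return RP_TO_ES[rp_category(alpha, c)]


# Normal, exponential, and lognormal all have alpha = 1 but different c.
print(es_category(1, 0), es_category(1, 1.0), es_category(1, math.inf))
```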

Thus, equation (6) provides a simple intuitive interpretation of the ES classification. As long as
the appropriate assumptions hold, a distribution is

ES short if $1 - F(x) \to 0$ faster than $f(x) \to 0$,
ES medium if $1 - F(x) \to 0$ at the same rate as $f(x) \to 0$,
ES long if $1 - F(x) \to 0$ slower than $f(x) \to 0$.

The Weibull distribution provides an example in which the distribution may be short-, medium-,
or long-tailed, depending on the value of the shape parameter.

Example 1: For the Weibull distribution $\bar F(x) = e^{-x^{\gamma}}$,

(7)    $\lim_{u \to 1^-} \frac{1 - u}{fQ(u)} = \lim_{u \to 1^-} \gamma^{-1}[-\ln(1 - u)]^{1/\gamma - 1} = \begin{cases} 0, & \gamma > 1, \\ 1, & \gamma = 1, \\ \infty, & \gamma < 1. \end{cases}$

Therefore the Weibull distribution is ES short for $\gamma > 1$, ES medium for $\gamma = 1$, and ES long for
$\gamma < 1$.
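
The limit in (7) can be checked numerically. The sketch below evaluates the ratio (1 - u)/fQ(u) for the Weibull survival function exp(-x^γ) at a few tail probabilities 1 - u; the function name `weibull_c_ratio` and the chosen tail probabilities are illustrative assumptions.

```python
import math


def weibull_c_ratio(tail_prob, shape):
    """(1 - u) / fQ(u) for Fbar(x) = exp(-x**shape), with tail_prob = 1 - u."""
    return (-math.log(tail_prob)) ** (1.0 / shape - 1.0) / shape


tail_probs = [1e-2, 1e-10, 1e-100, 1e-300]
ratios = {
    shape: [weibull_c_ratio(p, shape) for p in tail_probs]
    for shape in (0.5, 1.0, 2.0)
}
for shape, values in ratios.items():
    print(shape, values)
```

The ratio is constant at 1 for shape 1, decreases for shape 2, and increases for shape 0.5, but only at a logarithmic rate in the tail probability, so the limits 0 and infinity are approached very slowly.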

Additional examples of distributions classified according to the asymptotic behavior of the
extreme spacing are given in Table 2.

There is still some lack of precision in Theorem 1 and in the information provided by (6),
in the sense that case (ii) in Theorem 1 and the case $\alpha = 1$ in (6) include medium-, short-,
and long-tailed distributions. The connection between the asymptotic behavior of the failure rate
and tail-heaviness of the distribution F will be made precise in Lemma 5 below.

The ES classification method suggests the possibility of utilizing the asymptotic behavior of S n 
to differentiate among short, medium, and long-tailed distributions, but more specific results on 
the asymptotic distribution of S n are needed. 

4. Tail Classification using the Residual Lifetime Distribution. Rojo (1996) proposed
a classification scheme based on the asymptotic behavior of the residual life distribution.
This approach circumvents many of the technical assumptions required by previous approaches,
and provides a more precise characterization of class membership.

Definition 2 Define $h(t) = \lim_{x \to \infty} \bar H_x(t) = \lim_{x \to \infty} \bar F(t + x)/\bar F(x)$, $t > 0$, when the limit exists, where $\bar F = 1 - F$. The distribution
function F is considered short-tailed if $h(t) = 0$, medium-tailed if $0 < h(t) < 1$, and long-tailed if
$h(t) = 1$, for all t.



Although the limit h(t) given in Definition 2 exists for a fairly large class of distribution functions,
this is not always the case. The limit does not exist, for example, when there is an oscillatory
behavior in the tail of the residual life density. Examples of distributions for which h(t) does not
exist include $\bar F(x) = \exp(-x - 0.75\sin(x))$ and $\bar F(x) = c(1 + (1 + x)^{-1/2} + \sin((1 + x)^{1/2}))e^{-x}$, $x > 0$,
where $c = (2 + \sin(1))^{-1}$.

The following results are consequences of Definition 2. Theorem 3 below combines Theorem 4.1
and Corollary 4.2 in Rojo (1996). Hereafter, Exp($\theta$) will denote the exponential distribution with
parameter $\theta$, i.e. mean $1/\theta$.



THEOREM 3 Let F(x) be a distribution function and let h(t) be as in Definition 2. Then,

F is short-tailed $\iff$ $S_n = X_{(n)} - X_{(n-1)} \to 0$,
F is medium-tailed $\iff$ $S_n = X_{(n)} - X_{(n-1)} \xrightarrow{D} \mathrm{Exp}(\theta)$,
F is long-tailed $\iff$ $S_n = X_{(n)} - X_{(n-1)} \to \infty$.

Note that the results of Theorem 3 provide a more precise characterization of the various classes
of distributions, in agreement with a classification based on the asymptotic behavior of
the extreme spacing $S_n$. More importantly, Theorem 3 delineates the asymptotic distribution of the
ES for the medium class. This is precisely the result that will lead to a methodology for testing the
hypothesis of medium tail against either short or long tails in the next section. The asymptotic
distribution of $S_n$ for medium-tailed distributions is perhaps not surprising since, for the baseline
medium distribution, the exponential distribution, $S_n$ has an exponential distribution, for every n,
with the same parameter as the underlying distribution F (see, e.g., Barlow and Proschan (1996)).



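Example 1 (cont): For the Weibull($\gamma$) distribution,

(8)    $h(t) = \lim_{x \to \infty} \frac{\bar F(t + x)}{\bar F(x)} = \lim_{x \to \infty} e^{-(t + x)^{\gamma} + x^{\gamma}} = \begin{cases} 1, & 0 < \gamma < 1, \\ e^{-t}, & \gamma = 1, \\ 0, & \gamma > 1. \end{cases}$

Therefore, for $\gamma = 1$, the Weibull(1) distribution is medium-tailed.
For $0 < \gamma < 1$ it is long-tailed, and for $\gamma > 1$ the Weibull distribution is short-tailed.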



Example 2: The Pareto($\gamma$) distribution has

(9)    $h(t) = \lim_{x \to \infty} \frac{\bar F(t + x)}{\bar F(x)} = \lim_{x \to \infty} \frac{(x + t)^{-\gamma}}{x^{-\gamma}} = 1$

for all $t > 0$. Therefore the Pareto distribution is ES long by Definition 2.
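
The limits in Examples 1 and 2 can be checked numerically by evaluating the ratio of survival functions at a large x. The sketch below works on the log scale to avoid underflow; the helper names and the particular values of t and x are illustrative assumptions, and the Pareto survival function is taken to be x^(-index) for x ≥ 1.

```python
import math


def residual_ratio(log_sf, t, x):
    """Fbar(t + x) / Fbar(x), computed from the log survival function."""
    return math.exp(log_sf(t + x) - log_sf(x))


def weibull_log_sf(shape):
    """Log survival function of Fbar(x) = exp(-x**shape)."""
    return lambda x: -(x ** shape)


def pareto_log_sf(index):
    """Log survival function of Fbar(x) = x**(-index), x >= 1."""
    return lambda x: -index * math.log(x)


t = 1.0
h_exponential = residual_ratio(weibull_log_sf(1.0), t, 1e3)   # medium: exp(-t)
h_weibull_2 = residual_ratio(weibull_log_sf(2.0), t, 1e3)     # short: near 0
h_weibull_half = residual_ratio(weibull_log_sf(0.5), t, 1e8)  # long: near 1
h_pareto = residual_ratio(pareto_log_sf(2.0), t, 1e8)         # long: near 1
print(h_exponential, h_weibull_2, h_weibull_half, h_pareto)
```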

It is possible to refine the classification given in Definition 2 by subdividing the short- and
long-tailed distributions into three subclasses. This can be done by considering, instead, the asymptotic
behavior of M(x) = F (e F ^) in the short-tailed case and the behavior of N(x) = F (— l/lni ? (x)) 






Table 2

Refined Parzen (RP), Extreme Spacing (ES), and Rojo's Classifications of Distributions by Tail Behavior

Distribution               RP               ES        Rojo
Exponential                Medium-Medium    Medium    Medium
Normal                     Medium-Short     Short     Weakly-Short
Lognormal                  Medium-Long      Long      Weakly-Long
Uniform                    Short            Short     Super-Short
Cauchy                     Long             Long      Weakly-Long
Extreme Value              Medium-Short     Short     Moderately-Short
Pareto ($\alpha < 1$)      Long             Long      Super-Long
Pareto ($\alpha = 1$)      Long             Long      Moderately-Long
Pareto ($\alpha > 1$)      Long             Long      Weakly-Long
Weibull ($\alpha < 1$)     Medium-Long      Long      Weakly-Long
Weibull ($\alpha = 1$)     Medium-Medium    Medium    Medium
Weibull ($\alpha > 1$)     Medium-Short     Short     Weakly-Short
Logistic                   Medium-Medium    Medium    Medium
Standard Extreme Value     Medium-Short     Short     Moderately-Short


in the long-tailed case. Table 2, as given in Rojo (1996), classifies several common distributions 
using the various schemes discussed so far. 

Note that the classification scheme based on the asymptotic behavior of the extreme spacings, and
consequently the residual life function, is location and scale invariant. Henceforth, short-, medium-,
and long-tail will mean tail-heaviness in the sense of Theorem 3.

5. Testing for an ES medium tail. Let $X_1, X_2, \ldots, X_n$ represent a random sample from
the distribution function F, and let $\bar F_n$ denote the empirical survival function. Consider the test
statistic

(10)    $T_n = \frac{-\ln \bar F_n(\ln X_{(n)}) \, (X_{(n)} - X_{(n-1)})}{\ln X_{(n)}}.$


This section examines the operating characteristics of this statistic in the context of testing 
hypotheses about the tail behavior of F. 
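
A minimal sketch of how the statistic in (10) might be computed from a sample is given below. The function names are hypothetical, the empirical survival function is taken to be the fraction of observations strictly greater than its argument, and the guard requiring the sample maximum to exceed 1 (so that the logarithm in the denominator is positive) is an assumption of this sketch; the text above does not say how such samples are handled.

```python
import math


def empirical_survival(sample, x):
    """Fraction of observations strictly greater than x."""
    return sum(1 for value in sample if value > x) / len(sample)


def tail_statistic(sample):
    """T_n = -ln Fbar_n(ln X_(n)) * (X_(n) - X_(n-1)) / ln X_(n), as in (10)."""
    if len(sample) < 2:
        raise ValueError("need at least two observations")
    ordered = sorted(sample)
    x_max, x_second = ordered[-1], ordered[-2]
    if x_max <= 1.0:
        # Assumption of this sketch: ln X_(n) must be positive for the
        # ratio to be meaningful, so such samples are rejected outright.
        raise ValueError("the sample maximum must exceed 1")
    log_max = math.log(x_max)
    # ln X_(n) < X_(n), so at least the maximum itself exceeds log_max
    # and the empirical survival function is strictly positive here.
    theta_hat = -math.log(empirical_survival(ordered, log_max)) / log_max
    return theta_hat * (x_max - x_second)


toy_sample = [0.5, 1.0, 3.0, math.exp(2.0)]
t_n = tail_statistic(toy_sample)
print(t_n)
```

In the toy sample, two of the four observations exceed ln X_(n) = 2, so the first factor is ln(2)/2 and the spacing is e^2 - 3.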



The intuition guiding the choice of (10) as a test statistic arises from the following argument,
which starts with a result of Rojo (1996). Since for a medium-tailed distribution $X_{(n)} - X_{(n-1)} \to$
Exp($\theta$), with probability one, for some $\theta > 0$, with $\theta$ unknown, the need arises to estimate $\theta$ to
construct a test statistic whose asymptotic distribution does not depend on the unknown $\theta$. Now
note that, see Rojo (1996), for a medium-tailed distribution,

(11)    $\bar F(\ln y) = y^{-\theta} l(y), \quad \theta > 0,$

where $l(y)$ is some (unknown) slowly varying function. Therefore

(12)    $\frac{-\ln \bar F(\ln y)}{\ln y} = \theta - \frac{\ln l(y)}{\ln y}.$

The slowly varying function $l$ becomes a nuisance here, but fortunately it disappears in the limit
since $\ln(l(y))/\ln(y)$ converges to zero as $y \to \infty$. Therefore,

(13)    $\theta = \lim_{y \to \infty} \frac{-\ln \bar F(\ln y)}{\ln y},$

and $\theta$ may be estimated as follows:

(14)    $\hat\theta_n = \frac{-\ln \bar F_n(\ln X_{(n)})}{\ln X_{(n)}}.$

The consistency of the estimator $\hat\theta_n$ is an immediate consequence of the following theorem.

Theorem 4 Suppose that $\bar F(y)/(\bar F(\ln y))^{\delta} \to 0$ as $y \to \infty$ for some $\delta > 2$. Then,

(15)    $\frac{-\ln \bar F_n(\ln X_{(n)})}{-\ln \bar F(\ln X_{(n)})} \to 1.$

The consistency of $\hat\theta_n$ follows from Theorem 4 after multiplying and dividing expression (14) by
$\ln \bar F(\ln X_{(n)})$ and then using (13).



Since $\theta$ is the parameter of the limiting distribution of $X_{(n)} - X_{(n-1)}$ in the medium-tail
case in Theorem 3, the test statistic given by (10) has an asymptotic exponential distribution with
mean 1. That is, under the assumptions of Theorem 4,

(16)    $\frac{-\ln \bar F_n(\ln X_{(n)})}{\ln X_{(n)}} (X_{(n)} - X_{(n-1)}) \xrightarrow{D} \mathrm{Exp}(1)$

under an ES medium-tailed distribution. To use (16) as a basis for a test for the hypothesis of
medium tail, it must be verified that the conditions of Theorem 4 hold for medium-tailed
distributions. It turns out that $\bar F(y)/(\bar F(\ln y))^{\delta} \to 0$ as $y \to \infty$ for some $\delta > 2$ holds for all medium-
and for most short- and long-tailed distributions. This follows from the following lemma which
assumes that F has density f. Let the failure rate, $f(x)/\bar F(x)$, associated with F be denoted by
$r_F(x)$, and let $R_{-\infty}$ denote the set of rapidly varying functions, so that $f \in R_{-\infty}$ if and only if
$f(\lambda x)/f(x)$ converges to zero or $\infty$, as $x \to \infty$, depending on whether $\lambda > 1$ or $\lambda < 1$, respectively.
The following lemma is instrumental in proving one of the main lemmas in this section.

Lemma 5 Suppose that F has a density f and let $r_F = f/\bar F$ denote its failure rate. Then,

(i) The distribution F is short-tailed if and only if $r_F(t) \to \infty$ as $t \to \infty$.

(ii) The distribution F is medium-tailed if and only if $r_F(t) \to \theta$ as $t \to \infty$, for some $0 < \theta < \infty$.

(iii) The distribution F is long-tailed if and only if $r_F(t) \to 0$ as $t \to \infty$.

The following lemma shows that the condition $\bar F(y)/(\bar F(\ln y))^{\delta} \to 0$ is achieved by all medium-
tailed and most long- and short-tailed distributions.

Lemma 6 Let F be short-, medium- or long-tailed so that $\bar F(\ln x) \in R_{-\infty}$, $\bar F(\ln x) = x^{-\theta} l_1(x)$, or
$\bar F(\ln x) = l_2(x)$, respectively, for some slowly varying functions $l_i(x)$, $i = 1, 2$, and some $\theta > 0$.

(i) If F is long-tailed, then, without loss of generality, $\bar F(\ln x) = \exp\{\int_1^x \frac{\epsilon(t)}{t}\,dt\}$, with $\epsilon(t) \to 0$
as $t \to \infty$, so that $\bar F(\ln x)$ is a normalized slowly varying function and $r_F(\ln x) = -\epsilon(x)$.

(ii) If F is short-tailed, then without loss of generality, $\bar F(\ln x) = \exp\{\int_1^x \frac{z(u)}{u}\,du\}$, for some $z(u) \to -\infty$,
and then $r_F(\ln x) = -z(x)$.

(iii) Let F be short- or long-tailed with $z(x)/z(e^x) = o(x)$ or $\epsilon(x)/\epsilon(e^x) = o(x)$, respectively; or
suppose that F is medium-tailed. Then,

(17)    $\bar F(y)/(\bar F(\ln y))^{\delta} \to 0$ as $y \to \infty$ for all $\delta > 0$.
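
The condition in (17) can be explored numerically on the log scale. The sketch below evaluates the logarithm of the ratio for the exponential survival function exp(-x) and for the Pareto survival function 1/x (x ≥ 1); the helper names, the value δ = 3, and the grid of y values are illustrative assumptions.

```python
import math


def log_condition_ratio(log_sf, y, delta):
    """log of Fbar(y) / Fbar(ln y)**delta, computed on the log scale."""
    return log_sf(y) - delta * log_sf(math.log(y))


def exponential_log_sf(x):
    """Log survival function of Fbar(x) = exp(-x) (medium-tailed)."""
    return -x


def pareto_log_sf(x):
    """Log survival function of Fbar(x) = 1/x for x >= 1 (long-tailed)."""
    return -math.log(x)


ys = [1e2, 1e4, 1e8]
delta = 3.0
exp_logs = [log_condition_ratio(exponential_log_sf, y, delta) for y in ys]
par_logs = [log_condition_ratio(pareto_log_sf, y, delta) for y in ys]
print(exp_logs)
print(par_logs)
```

Both sequences decrease without bound, consistent with the ratio tending to zero, although for the Pareto case the decrease is slow: the log ratio equals -ln y + δ ln ln y.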
Recall that a distribution G is in the domain of attraction of the Fréchet distribution if and only
if it satisfies the von Mises condition ($y r_G(y) \to \lambda$ for some $\lambda > 0$), or if G is tail-equivalent to one
such distribution. Therefore, if F is long-tailed, the condition that $\epsilon(\ln x)/\epsilon(x) = o(x)$ is satisfied,
in particular, by all those distributions satisfying the von Mises condition. Thus, for example, any
F with $\bar F \in R_{-\alpha}$ with density f eventually monotone satisfies (17). In the case of short-tailed
distributions, any distribution F for which $-\ln \bar F(x)$ is convex in a neighborhood of infinity will
satisfy the condition that $z(\ln x)/z(x) = o(x)$, since then $r_F$ is increasing in such a neighborhood and
the result follows since $r_F(\ln x) = -z(x)$. An example of a short-tailed distribution that satisfies
the conditions on z without having increasing failure rate in a neighborhood of infinity is provided
by a distribution F with failure rate $r_F(x) = x + \ln(x)(1 + \sin(x))/x^{1/5}$.

A level $\alpha$ test of the hypothesis of an ES medium-tailed distribution can, therefore, be introduced
using the asymptotic behavior of the test statistic $T_n$, defined by (10), with critical values being
the $\alpha$ and $1 - \alpha$ percentiles of the Exp(1) distribution. The null hypothesis to be tested is $H_0$: F
is ES medium-tailed, against either of the alternative hypotheses $H_{a1}$: F is ES short-tailed or $H_{a2}$: F
is ES long-tailed. The decision rules with significance level $\alpha$ are: (1) reject $H_0$ in favor of $H_{a1}$ if
$T_n < -\ln(1 - \alpha)$, and (2) reject $H_0$ in favor of $H_{a2}$ if $T_n > -\ln(\alpha)$. Otherwise do not reject $H_0$.
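
The decision rules above translate directly into code. The following sketch takes a value of the statistic as input; the function name `classify_tail`, the returned labels, and the default α = 0.05 are hypothetical choices made for illustration.

```python
import math


def classify_tail(t_n, alpha=0.05):
    """Apply the two one-sided rules based on Exp(1) percentiles."""
    lower = -math.log(1.0 - alpha)  # alpha percentile of Exp(1)
    upper = -math.log(alpha)        # (1 - alpha) percentile of Exp(1)
    if t_n < lower:
        return "short"   # reject H0 in favor of a short tail
    if t_n > upper:
        return "long"    # reject H0 in favor of a long tail
    return "medium"      # do not reject H0


print(classify_tail(0.01), classify_tail(1.0), classify_tail(5.0))
```

Each rule has asymptotic level α on its own; if both rules are applied to the same statistic, as in this sketch, the probability of rejecting the medium-tail hypothesis in either direction is about 2α.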

It is clear that the asymptotic levels of the tests equal $\alpha$, and, thus, the test has point-wise robust
levels. The following two theorems prove consistency against short- and long-tailed alternatives. In
the case of short-tailed alternatives we further sub-classify them, as in Rojo (1996), according to
the asymptotic behavior of the cumulative hazard function. Thus, a short-tailed distribution F is
said to be super-, moderately-, or weakly-short when $-\ln \bar F(\ln x)$ is rapidly, regularly, or slowly
varying, respectively. The following theorem provides the consistency of the test for short-tailed alternatives.
In the case of weakly-short distributions a mild additional condition is needed to get the result.

Theorem 7 Let F be short-tailed so that $-\ln \bar F(\ln x) = h(x)$ with h a regularly varying function
with index $0 \le \gamma < \infty$. When $\gamma = 0$, suppose in addition that T -fj^y $\to 0$ as $x \to \infty$. Under the
assumptions of Theorem 4, the test defined by the test statistic $T_n$ that rejects when $T_n < -\ln(1 - \alpha)$
is consistent against the class of short-tailed alternatives.

The condition that 1 ~f~n^ $\to 0$ as $x \to \infty$ when F is a weakly-short distribution is rather
mild, and it is satisfied, for example, by all distributions with survival functions of the form $\bar F(x) =
\exp(-x^{(1+\alpha)})$, for $\alpha > 0$; $\bar F(x) = \exp(-x \ln_k x)$, where $\ln_k$ denotes the kth iterated natural log; and
$\bar F = \exp(-\exp(x^{\alpha}))$, $0 < \alpha < 1$.

Similar results hold for long-tailed distributions as stated in the next theorem. As in the case of
short-tailed distributions, a condition is imposed on the tail of F and it is seen that this condition
is satisfied by a large class of distributions with either regularly or slowly varying tails. The case
of long tails with $\bar F$ rapidly varying, e.g. $\bar F(x) = \exp(-x^{\alpha})$ with $0 < \alpha < 1$, seems to also satisfy
the condition as demonstrated by many examples, but we are unable to prove the result in general.

Theorem 8 Let F be long-tailed and suppose that $r_F(x)$ is eventually decreasing with

(18) -^T-*)- 

Under the assumptions of Theorem 7, the test defined by the test statistic $T_n$ that rejects when
$T_n > -\ln(\alpha)$ is consistent against this class of long-tailed alternatives.

Other examples that satisfy the conditions of Theorem 8 include: $\bar F(x) = e^{-(\ln x)^{\alpha}}$, for $0 < \alpha < 1$,
and regularly varying functions of the form $\bar F(x) = \exp\{\int_1^x \frac{\epsilon(t)}{t}\,dt\}$, where $-\epsilon(t) \to \alpha > 0$ and $-\epsilon(t)$ is

Table 3

Type I errors - Exponential and Logistic Distributions

n       Exp(100) S  Exp(100) L  Exp(1) S  Exp(1) L  Exp(.01) S  Exp(.01) L  Lgis S  Lgis L  G(.7) S  G(.7) L
10      .0443       .0000       .0541     .0805     .5742       .1146       .0382   .1427   .1057    .1113
50      .0509       .0001       .0485     .0563     .1085       .0686       .0447   .0730   .0387    .0979
100     .0515       .0017       .0491     .0517     .0675       .0642       .0477   .0623   .0428    .0928
250     .0532       .0053       .0506     .0541     .0530       .0590       .0471   .0559   .0382    .0908
500     .0536       .0085       .0542     .0517     .0541       .0545       .0490   .0594   .0384    .0859
1000    .0544       .0100       .0501     .0503     .0526       .0546       .0462   .0568   .0411    .0904
2500    .0554       .0153       .0514     .0507     .0525       .0502       .0480   .0572   .0417    .087
5000    .0523       .0179       .0498     .0463     .0477       .0499       .0428   .0590   .0392    .0853
10k     .0514       .0200       .0516     .0458     .0526       .0549       .0476   .0558   .042     .0906
20k     .0544       .0248       .0487     .0508     .0498       .0459       .0485   .0546   .0377    .083


decreasing. It is possible to obtain the consistency results for regularly varying tails by replacing
the conditions of Theorem 8 by a different condition. This is the content of the following theorem.

Theorem 9 Let F be long-tailed with $\bar F$ regularly varying of exponent $\alpha > 0$, so that $\bar F(x) =
c(x)\exp(\int_1^x \frac{\epsilon(t)}{t}\,dt) = L(x)$, where $\epsilon(t) \to -\alpha$, and suppose that $c(x) \to c > 0$, with $c(x)$
nondecreasing, and $-L'/L$ eventually non-increasing. Then, when Theorem 4 holds, the test defined in
the previous theorem is consistent against these long-tailed alternatives.

Thus the test defined by T n is consistent against short- and long-tailed alternatives. These are 
asymptotic results. The next section provides results from a simulation study that examines the 
power properties for finite sample sizes. 

6. Simulation Results. The previous section discussed the asymptotic properties of Type 
I error control and consistency of the test against long- and short-tailed distributions. This section 
investigates the Type I error and power properties of the test for finite samples from various
distributions. 

Table 3 gives the rejection probabilities when sampling from various exponential distributions
as well as the logistic distribution and the gamma distribution with scale = 1 and shape = .7. The
values given are Type I errors since all distributions are ES medium-tailed. The probabilities for
each sample size are found from 10,000 simulations of the chosen sampling distribution.
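
A simulation of this kind can be sketched as follows for the Exp(1) case. This is not the authors' code: the function names, the seed, the sample size n = 200, and the reduced number of replications (2,000 rather than 10,000, to keep the run short) are assumptions made for illustration, and the statistic is computed as in (10) under the assumption that the sample maximum exceeds 1.

```python
import math
import random


def tail_statistic(sample):
    """T_n from (10); assumes the sample maximum exceeds 1."""
    ordered = sorted(sample)
    x_max, x_second = ordered[-1], ordered[-2]
    log_max = math.log(x_max)
    surviving = sum(1 for value in ordered if value > log_max)
    theta_hat = -math.log(surviving / len(ordered)) / log_max
    return theta_hat * (x_max - x_second)


def rejection_rates(draw, n, reps, alpha=0.05, seed=0):
    """Monte Carlo estimates of P(reject for short) and P(reject for long)."""
    rng = random.Random(seed)
    lower, upper = -math.log(1.0 - alpha), -math.log(alpha)
    short = long_ = 0
    for _ in range(reps):
        t_n = tail_statistic([draw(rng) for _ in range(n)])
        short += t_n < lower
        long_ += t_n > upper
    return short / reps, long_ / reps


short_rate, long_rate = rejection_rates(
    lambda rng: rng.expovariate(1.0), n=200, reps=2000
)
print(f"Exp(1), n=200: short {short_rate:.3f}, long {long_rate:.3f}")
```

Since Exp(1) is ES medium-tailed, both estimated rates are Type I error rates and would be expected to lie near α = .05, up to Monte Carlo error.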

Table 3 shows good performance of the test statistic $-\ln \bar F_n[\ln X_{(n)}](X_{(n)} - X_{(n-1)})/\ln X_{(n)}$.
Most of the values are close to the desirable value of $\alpha = .05$, except in the Exp(100) and
Gamma(.7) cases, as well as in a few instances, Exp(1) and Exp(.01), when the sample size was
extremely small, n = 10.



The case of Gamma(.7) illustrates the fact that the convergence of $\ln l(y)/\ln(y)$ in (12) can be
very slow. When this happens, the estimator $-\ln \bar F_n[\ln X_{(n)}]/\ln X_{(n)}$, although converging to $\theta$,
will do so rather slowly, and this will be reflected in the probability of Type I error, as seen in Table
3. Despite this, the test statistic performs very well under various ES medium-tailed distributions.
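
To see where the slow convergence may come from, the following worked calculation uses the standard large-$x$ approximation of the Gamma($a$, 1) survival function, $\bar F(x) \approx x^{a-1}e^{-x}/\Gamma(a)$. This approximation and the symbol $a$ for the shape parameter are assumptions introduced here for illustration; they do not appear in the text above, and the approximation is only rough at moderate arguments.

```latex
\begin{align*}
\bar F(\ln y) &\approx \frac{(\ln y)^{a-1}\, y^{-1}}{\Gamma(a)}
  \quad\Longrightarrow\quad \theta = 1, \qquad
  l(y) \approx \frac{(\ln y)^{a-1}}{\Gamma(a)}, \\
\frac{\ln l(y)}{\ln y} &\approx \frac{(a-1)\,\ln\ln y - \ln\Gamma(a)}{\ln y}.
\end{align*}
```

Under this approximation the nuisance term in (12) decays only like $\ln\ln y/\ln y$. The estimator evaluates it at $y = X_{(n)}$, which itself grows roughly like $\ln n$ for a gamma sample, so the term shrinks very slowly in $n$. For $a = .7$ the term is negative for $y > e$, so that $-\ln \bar F(\ln y)/\ln y$ exceeds $\theta = 1$, which would be consistent with the inflated long-tail rejection rates for Gamma(.7) in Table 3.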

We now turn our attention to the power of this test statistic when sampling from various ES 
short- and long-tailed distributions. Besides tracking the power of detecting an ES short- or long- 
tailed distribution, it may be just as important to notice the probability of a serious misclassification 
error. A serious misclassification error is one in which an ES short-tailed distribution sample has 
been classified as long-tailed or vice-versa. Thus, both ES short and long percentages are given. 
Again, 10,000 simulations were used for each sample size. Table 4 gives classification probabilities 
for various shifted Pareto($\gamma$) distributions with survival functions $\bar F(x) = (1+x)^{-\gamma}$, $x > 0$. The test 
statistic shows great power against Pareto(1) and Pareto(2) alternatives. As expected from the 
results of Heyde and Kou (2004), power decreases as $\gamma$ increases. 

Table 4 
Power for testing against short- and long tails - Pareto 

n      P(1) S  P(1) L  P(2) S  P(2) L  P(5) S  P(5) L  P(10) S  P(10) L 
10     .0126   .5341   .0281   .2677   .0433   .0411   .0409    .0097 
50     .0047   .8051   .0151   .4547   .0346   .1678   .0439    .0508 
100    .0024   .8807   .0119   .5364   .0289   .2035   .0416    .0856 
250    .0017   .9420   .0069   .6372   .0282   .2322   .0396    .1169 
500    .0007   .9673   .0070   .7073   .0227   .2660   .0327    .1324 
1000   .0003   .9809   .0051   .7726   .0219   .2990   .0352    .1431 
2500   .0002   .9921   .0029   .8375   .0191   .3562   .0305    .1589 
5000   .0001   .9968   .0033   .8721   .0171   .3865   .0300    .1733 
10k    .0000   .9977   .0019   .9030   .0164   .4243   .0270    .1930 
20k    .0000   .9993   .0001   .9284   .0140   .4571   .0248    .2020 

Table 5 
Power for testing against short- and long tails - Weibull 

n      W(5) S  W(5) L  W(2) S  W(2) L  W(1/2) S  W(1/2) L 
10     .7008   .0000   .1678   .0011   .0161     .3936 
50     .7589   .0000   .1699   .0001   .0127     .5050 
100    .7862   .0000   .1757   .0000   .0087     .5450 
250    .7999   .0000   .1711   .0001   .0101     .5855 
500    .8071   .0000   .1840   .0000   .0085     .6171 
1000   .8177   .0000   .1806   .0000   .0069     .6500 
2500   .8342   .0000   .1851   .0000   .0068     .6698 
5000   .8428   .0000   .1943   .0000   .0067     .6844 
10k    .8497   .0000   .1910   .0000   .0072     .7020 
20k    .8519   .0000   .2020   .0000   .0049     .7158 

Table 5 gives the classification probabilities for a Weibull($\gamma$) distribution with survival function 
$\bar F(x) = e^{-x^{\gamma}}$, $x > 0$. As stated previously, the Weibull is ES short-tailed for $\gamma > 1$ and ES long-tailed for 
$0 < \gamma < 1$. The simulations show good power against the Weibull(5) distribution and reasonable 
power for the Weibull(1/2) distribution. The power decreases as $\gamma$ nears 1, as expected. 
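
Samples from the two alternative families can be generated by inverting the survival functions quoted in the text. The following is a minimal sketch under that assumption; the function names are hypothetical, and the paper does not say how its samples were actually drawn.

```python
import math
import random

def sample_shifted_pareto(gamma, rng):
    """Draw X with survival function (1 + x)^(-gamma), x > 0, by inversion."""
    u = 1.0 - rng.random()            # u lies in (0, 1]
    return u ** (-1.0 / gamma) - 1.0

def sample_weibull(gamma, rng):
    """Draw X with survival function exp(-x^gamma), x > 0, by inversion."""
    u = 1.0 - rng.random()            # u lies in (0, 1], so log(u) is finite
    return (-math.log(u)) ** (1.0 / gamma)

rng = random.Random(0)
pareto_draws = [sample_shifted_pareto(2.0, rng) for _ in range(20000)]
weibull_draws = [sample_weibull(2.0, rng) for _ in range(20000)]

# Empirical survival probabilities at x = 1, to compare with
# (1 + 1)^(-2) = 0.25 and exp(-1) respectively.
pareto_surv = sum(x > 1.0 for x in pareto_draws) / len(pareto_draws)
weibull_surv = sum(x > 1.0 for x in weibull_draws) / len(weibull_draws)
print(pareto_surv, weibull_surv)
```

Setting $\bar F(X) = U$ with $U$ uniform on (0, 1) and solving for $X$ gives both formulas.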

Table 6 shows the power against U(0, 1), extreme value, and normal distributions. For a sample 
size of 100 or larger, the test is almost perfect for detecting a U(0, 1) sample as ES short. The 
power is unfortunately not that high for the extreme value and normal distributions. In both cases, 
for n > 10 the percentage of ES short classifications did at least outnumber the ES long ones, but 
the percentage of simulations from a normal distribution that were rejected as ES medium and 
classified as short-tailed was only approximately 10% for a sample size as large as 5000. The lack of 
power in this case is addressed in the next section. 
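
The classification rates reported in the tables of this section are Monte Carlo estimates. A minimal sketch of such a simulation is given below, assuming a single-sample rule that compares $T_n$ with the $\alpha$ and $1-\alpha$ quantiles of the Exp(1) distribution (the k = 1 case of the blocked rule in Section 7). The helper names, the seed, and the number of replications are illustrative choices, not the authors' settings.

```python
import math
import random

def es_statistic(sample):
    """Extreme-spacing statistic; assumes the sample maximum exceeds 1."""
    xs = sorted(sample)
    log_max = math.log(xs[-1])
    surv = sum(1 for x in xs if x > log_max) / len(xs)
    return -math.log(surv) * (xs[-1] - xs[-2]) / log_max

def classification_rates(sampler, n, reps, alpha=0.05, seed=0):
    """Estimate P(classified short) and P(classified long) by simulation."""
    rng = random.Random(seed)
    lower = -math.log(1.0 - alpha)    # alpha quantile of Exp(1)
    upper = -math.log(alpha)          # (1 - alpha) quantile of Exp(1)
    short = long_ = 0
    for _ in range(reps):
        t = es_statistic([sampler(rng) for _ in range(n)])
        if t < lower:
            short += 1
        elif t > upper:
            long_ += 1
    return short / reps, long_ / reps

# Under Exp(1) sampling both rates are Type I error estimates.
short_rate, long_rate = classification_rates(
    lambda rng: rng.expovariate(1.0), n=500, reps=2000
)
print(short_rate, long_rate)
```

Replacing the sampler with a draw from an ES short- or long-tailed distribution turns the same two numbers into the power and the serious misclassification rate.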

Finally, Table 7 gives the power of the test for a few common ES long-tailed distributions. The 
percentage of correct classification for the lognormal is less than desirable, slightly better for a t(3) 
sample, and excellent for a Cauchy sample. 

Table 6 
Power against short- and long-tails - sampling from ES short distributions 

n      Unif(0,1) S  Unif(0,1) L  ExtVal S  ExtVal L  Normal S  Normal L 
10     .3194        0            .1145     0         .0465     .0533 
50     .7766        0            .1308     .0002     .0629     .0148 
100    .9459        0            .1467     0         .0666     .0083 
250    .9988        0            .1722     0         .0816     .0041 
500    1            0            .1816     0         .0846     .0029 
1000   1            0            .1914     0         .0913     .0018 
2500   1            0            .2077     0         .0952     .0018 
5000   1            0            .2283     0         .0984     .0006 
10k    1            0            .2335     0         .1018     .0006 
20k    1            0            .2401     0         .1032     .0005 

Table 7 
Power against short- and long-tails - sampling from ES long distributions 

n      Lnorm S  Lnorm L  t(3) S  t(3) L  Cauchy S  Cauchy L 
10     .0426    .1775    .0353   .1833   .0154     .4640 
50     .0253    .2828    .0265   .2549   .0054     .7187 
100    .0200    .3329    .0242   .3022   .0028     .8161 
250    .0187    .3910    .0165   .3741   .0016     .9042 
500    .0154    .4437    .0159   .4322   .0008     .9437 
1000   .0134    .4953    .0135   .4934   .0005     .9701 
2500   .0089    .5441    .0091   .5740   .0002     .9855 
5000   .0104    .5826    .0087   .6363   .0002     .9935 
10k    .0074    .6292    .0076   .6879   .0000     .9935 
20k    .0063    .6574    .0058   .7361   .0000     .9977 

7. Blocking the Data for Increased Power. The previous section introduced a test 
to distinguish among ES short-, medium-, and long-tailed distributions by tail behavior, 
using the extreme spacing. The test shows good power in distinguishing significantly different 
tail behaviors. But the test showed less capability for distinguishing a lognormal sample from an 
exponential sample and a normal sample from an exponential sample. This section addresses the 
low power values seen in the previous section when sampling from distributions with tails which 
do not differ much from the exponential. The procedure of blocking the data, finding the test 
statistic for each block, and combining the block test statistics into one test increases the power 
substantially. 

Notice from Table 6 that the power of detecting an ES short tail when sampling from a normal 
distribution is approximately 10% for n ≤ 20,000. Blocking the data into k separate blocks may give 
rise to additional power. Each block of size approximately m = n/k is its own independent subsample, 
which will produce independent values for the test statistic $T_m$. Under the null hypothesis of an ES 
medium tail, the sum of the k block statistics can be used as the overall test statistic. 

Under the null hypothesis, the sum of the k block statistics has an asymptotic gamma(k, 1) 
distribution. Let $TS_j$ be the block test statistic for block j, where j = 1, ..., k. The hypotheses 
to be tested are $H_0$: F is ES medium-tailed vs 
$H_{a1}$: F is ES short-tailed or $H_{a2}$: F is ES long-tailed. The decision rule with significance level 
$\alpha = 5\%$ is: reject $H_0$ in favor of $H_{a1}$ if $\sum_{j=1}^{k} TS_j < \mathrm{qgamma}(\alpha, k, 1)$; reject $H_0$ in favor of $H_{a2}$ 
if $\sum_{j=1}^{k} TS_j > \mathrm{qgamma}(1-\alpha, k, 1)$; otherwise do not reject $H_0$, where qgamma(p, k, 1) is the pth 
quantile of the gamma(k, 1) distribution. 
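
The decision rule above can be sketched in a few lines of Python. This is an illustration only: `erlang_quantile` is a hypothetical stand-in for `qgamma(p, k, 1)`, obtained by bisection on the closed-form gamma(k, 1) distribution function for integer k, and the block statistics $TS_j$ are assumed to have been computed already.

```python
import math

def erlang_cdf(x, k):
    """CDF of the gamma(k, 1) distribution for a positive integer k."""
    if x <= 0.0:
        return 0.0
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= x / i                 # term equals x^i / i!
        total += term
    return 1.0 - math.exp(-x) * total

def erlang_quantile(p, k, tol=1e-10):
    """Invert erlang_cdf by bisection; stands in for qgamma(p, k, 1)."""
    lo, hi = 0.0, 1.0
    while erlang_cdf(hi, k) < p:      # expand until the quantile is bracketed
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if erlang_cdf(mid, k) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def blocked_decision(block_stats, alpha=0.05):
    """Classify the tail from the sum of the k block statistics."""
    k = len(block_stats)
    total = sum(block_stats)
    if total < erlang_quantile(alpha, k):
        return "short"
    if total > erlang_quantile(1.0 - alpha, k):
        return "long"
    return "medium"

# Made-up block statistics for k = 5 blocks.
print(blocked_decision([0.1] * 5), blocked_decision([1.0] * 5), blocked_decision([3.0] * 5))
```

With k = 1 the two cut-offs reduce to the $\alpha$ and $1-\alpha$ quantiles of the Exp(1) distribution.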

Table 8 gives the Type I errors found when sampling from the Exp(1) and logistic distributions 
for the sample sizes of 500, 5000, and 20000 using various numbers of blocks. As shown in the table, 
the suggested number of blocks to use is somewhere between 5 and 10; otherwise too few points 
are in each block, leading to large Type I errors. For a sample size of 5000, it appears that up to 25 
blocks can be used without causing significant Type I errors. In what follows, En stands for Exp(1) 
sampling with sample size n and Ln stands for logistic sampling with sample size n. 

Table 8 
Type I Errors while Blocking the Data for Sample Sizes n = 500, 5000, and 20000 

blocks  E500 S  E500 L  L500 S  L500 L  E5000 S  E5000 L  L5000 S  L5000 L 
1       .0541   .0545   .0490   .0594   .0498    .0463    .0428    .0590 
5       .0497   .0538   .0365   .0823   .0549    .0488    .0381    .0639 
10      .0509   .0660   .0246   .1218   .0499    .0546    .0332    .0734 
25      .0393   .1058   .0050   .3469   .0463    .0558    .0252    .0994 
50      .0313   .1344   .0048   .8164   .0455    .0674    .0122    .1691 

blocks  Exp20k S  Exp20k L  L20k S  L20k L 
1       .0487     .0508     .0485   .0546 
5       .0481     .0500     .0457   .0597 
10      .0510     .0507     .0382   .0692 
25      .0498     .0544     .0300   .0861 
50      .0485     .0555     .0214   .1177 
100     .0437     .0604     .0097   .1885 

Tables 9-11 show that for as few as 5 or 10 blocks the power of detecting an ES short- or 
long-tailed sample can increase substantially. 

Table 9 
Power* against Short- and Long-tailed alternatives when Blocking the Data; n = 500 

blocks  Norm S  Norm L  ExtVal S  Lnorm L  Par(5) L  Weib(2) S 
1       .0846   .0029   .1816     .4437    .2660     .1840 
5       .1653   0       .7722     .7485    .4439     .8473 
10      .1887   0       .9635     .9721    .5113     .9941 
25      .0950   .0133   .9978     .9483    .4740     1 

*Not shown: Power of 0 when testing for long tails and the distribution is Extreme Value, or Weibull(2); 
Not shown: Power < .01 when testing for short tails and sampling from Lognormal, or Pareto(1) 

Table 10 
Power* against Short- and Long-tailed alternatives while Blocking the Data; n = 5000 

blocks  Norm S  ExtVal S  Lnorm L  Par(5) L  Weib(2) S 
1       .0984   .2283     .5826    .3865     .1943 
5       .3120   .9406     .9287    .7041     .8907 
10      .5172   .9994     .9921    .8372     .9986 
20      .7339   1         .9996    .9404     1 
50      .8960   1         1        .9924     1 

*Not shown: Long Tail Classifications for Ext Value, Weibull(2), Normal; less than 1% Short Tail Classifications 
for Lognormal, Pareto(1) 

Table 11 
Power against Short- and Long-tailed alternatives while Blocking the Data; n = 20000 

blocks  Norm S  ExtVal S  Lnorm L  Par(5) L  Weib(2) S 
1       .1071   .2442     .6573    .4607     .2080 
5       .3804   .9771     .9715    .8222     .9160 
10      .6552   1         .9975    .9360     .9985 
20      .9435   1         1        .9902     1 
50      .9966   1         1        .9999     1 

For a sample size of n = 500, blocking the data (Table 9) increases the correct classification 
to a desirable value, > 90%, for the extreme value, lognormal, and Weibull(2) distributions. The 
power increases for the normal and Pareto(5) distributions. The reason why the percentage does 
not increase over 90% is twofold. First, the normal and Pareto(5) are very similar to an 
exponential in tail behavior; second, since there are few points in each block, the standard error of 
$\{-\ln \bar F_n[\ln X_{(n)}]/\ln X_{(n)}\}$ increases. In other words, it is difficult with a sample size as small as 
500 to be able to consistently distinguish a normal or Pareto(5) tail from an exponential. The 
selection of the number of blocks is driven by a trade-off between bias and power of the tests. Table 
10 does show significant power improvement for a sample size of 5000. More improvement is shown 
in Table 11 for n = 20,000. 

8. Comparison with Bryson test. Bryson (1974) proposed a procedure to test the 
hypothesis of an underlying exponential distribution against long-tailed distributions with (increasing) 
linear mean residual lifetime functions. Examples of these long-tailed distributions are the Lomax 
distributions. Based on invariance considerations, the Bryson test is defined as 

(19) $$T^* = \frac{\bar X\, X_{(n)}}{(n-1)\,\bar X_{GA}^{2}},$$

where 

$$\bar X_{GA} = \Big(\prod_{i=1}^{n} (X_i + A_n)\Big)^{1/n}$$

with $A_n = X_{(n)}/(n-1)$. It follows from (19) that the asymptotic behavior of the test based on 
$T^*$ will be affected by the asymptotic behavior of $X_{(n)}$. One drawback of Bryson's test is that its 
asymptotic distribution is not known and the critical values have to be simulated. Bryson (1974) 
provides the critical values for several small sample sizes and three levels for the test ($\alpha$ = .01, .05 
and .10). For the purpose of the present problem, this means that, since we do not know that the 
levels of the test defined through $T^*$ are robust (for the class of medium-tailed distributions), it 
is difficult to apply the test for our purposes. Nevertheless, simulation work shows that the test has 
good power, sometimes higher power than the test proposed here, for those distributions for which 
it was developed. In addition, its power is competitive with the power of the test defined through 
(20) below. Its main drawback, however, is that it may have probability of Type I error close to 
1 when the underlying distribution is the gamma or the log-gamma distribution. Thus the test may 
reject the null hypothesis of medium-tail in favor of a long-tail when the shape parameter of the 
gamma distribution is larger than 1; on the other hand, it will reject the null hypothesis in favor 
of a short-tailed distribution, with probability of Type I error close to 1, in the case that the 
shape parameter of the gamma is smaller than 1. For the case of the log-gamma distribution, which 
is long-tailed, Bryson's test may reject in favor of the decision of short-tail with high probability. 
The reason this happens is that, since for the gamma distribution the centering sequence to achieve 
a limiting distribution for $X_{(n)}$ is given by $\log n + (\alpha - 1)\log\log n - \log\Gamma(\alpha)$, then depending on 
whether $\alpha$ is smaller or greater than 1, the test statistic will favor a short-tail or long-tail alternative. 
The case of the log-gamma distribution follows in a similar manner. Table 12 shows 
the simulated quantiles for the distribution of the test statistic $T^*$ under the gamma distribution 
for various values of the shape parameter. In all cases, the scale parameter is set to 1. Table 13 
shows the simulated quantiles for the distribution of the test statistic $T^*$ under the log-gamma 
distribution for various values of the scale parameter. In all cases, the shape parameter is set to 
1/2. 

Table 12 
Quantiles for the Bryson Test for Sample Sizes n = 50, 100, 500, 5000, 10000, and 20000 for the 
gamma Distribution: Shape=2, 1, 1/2; Scale=1 

        Shape = 2                        Shape = 1                        Shape = 1/2 
n       .025    .05     .95     .975     .025    .05     .95     .975     .025    .05     .95     .975 
50      0.0611  0.0643  0.1246  0.1336   0.1035  0.1104  0.2371  0.2546   0.2042  0.2212  0.4588  0.4852 
100     0.0386  0.0407  0.0752  0.0807   0.0745  0.0791  0.1624  0.1749   0.1751  0.1889  0.3768  0.3968 
500     0.0114  0.0119  0.0194  0.0205   0.0271  0.0283  0.0506  0.0543   0.0947  0.0996  0.1794  0.1902 
5000    0.0016  0.0016  0.0024  0.0025   0.0044  0.0045  0.0071  0.0075   0.0224  0.0232  0.0368  0.0390 
10000   0.0008  0.0009  0.0012  0.0013   0.0024  0.0025  0.0038  0.0040   0.0133  0.0137  0.0212  0.0224 
20000   0.0004  0.0004  0.0006  0.0007   0.0013  0.0013  0.0020  0.0021   0.0077  0.0079  0.0119  0.0126 

Table 13 
Quantiles for the Bryson Test for Sample Sizes n = 50, 100, 500, 1000, 5000, 10000, and 20000 for 
the log-gamma Distribution: Shape=1/2; Scale=1/6, 1 

        Scale = 1/6                      Scale = 1 
n       .025    .05     .95     .975     .025    .05     .95     .975 
50      0.0239  0.0245  0.0419  0.0461   0.0682  0.0783  0.6478  0.7257 
100     0.0132  0.0136  0.0240  0.0267   0.0624  0.0722  0.7082  0.7866 
500     0.0033  0.0034  0.0064  0.0071   0.0558  0.0671  0.7655  0.8515 
1000    0.0018  0.0019  0.0035  0.0039   0.0564  0.0681  0.7902  0.8745 
5000    0.0004  0.0004  0.0009  0.0010   0.0558  0.0678  0.8247  0.9056 
10000   0.0002  0.0002  0.0005  0.0005   0.0559  0.0676  0.8346  0.9188 
20000   0.0001  0.0001  0.0002  0.0003   0.0564  0.0689  0.8407  0.9215 

The quantiles in Table 12 for shape = 1 are the critical values used to implement Bryson's test. It 
becomes evident at once that Bryson's test will reject, with probability close to 1, the hypothesis 
of medium tail in favor of short tail when the (medium-tailed) underlying distribution is gamma 
with shape equal to 2 and scale equal to 1 for sample sizes 100 or higher, since the 97.5th quantile 
for the test statistic in this case is smaller than the 2.5th quantile of the test statistic under the 
standard exponential distribution. Similarly, Bryson's test would reject the null hypothesis of 
medium tails with high probability if the true underlying distribution is gamma with shape = 1/2 
and scale = 1. This is also obvious from Table 12, since the 97.5th quantile for the test statistic under 
the null hypothesis is smaller than the 2.5th quantile of the test statistic under the gamma with 
shape = 1/2 and scale = 1. Similar observations hold for the case of the log-gamma distribution. 
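
For readers who wish to reproduce quantiles of this kind, the statistic in (19) is simple to compute. The sketch below is illustrative only: the function name is hypothetical, and it assumes positive observations so that every $X_i + A_n$ is positive.

```python
import math

def bryson_statistic(sample):
    """T* = mean * max / ((n - 1) * GA^2), with GA the geometric mean
    of the shifted observations X_i + A_n and A_n = max / (n - 1)."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    x_max = max(sample)
    a_n = x_max / (n - 1)
    mean = sum(sample) / n
    # Work on the log scale to avoid overflow in the product.
    log_ga = sum(math.log(x + a_n) for x in sample) / n
    return mean * x_max / ((n - 1) * math.exp(2.0 * log_ga))

# Toy data, not taken from the paper.
data = [1.0, 2.0, 3.0, 4.0]
t_star = bryson_statistic(data)
t_star_rescaled = bryson_statistic([10.0 * x for x in data])
print(t_star, t_star_rescaled)
```

The two printed values agree up to rounding because the statistic is scale invariant, which is why a single table of critical values under the standard exponential suffices for any exponential scale.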

9. Illustrations of Data Analysis. This section is devoted to the analysis of three data 
sets illustrating the methodologies of previous sections. The first data set is the Secura Belgian Re 
data set and consists of 371 automobile claims from 1988 to 2001 from numerous European insurance 
companies. Each claim was at least 1.2 million Euros. These data, adjusted for inflation, are discussed 
in Beirlant et al. (2004). Figures 1 and 2 show the histogram and exponential q-q plot. Since 
the empirical quantiles in the right tail are greater than the corresponding exponential quantiles, 
the right tail appears to be longer than exponential, i.e. long-tailed. This has been confirmed by 
classical techniques in Beirlant et al. (2004). A Pareto-type distribution was fitted to the data and 
the long-tailed behavior of the data was also observed in the empirical mean residual life plots. 

We consider testing for a medium tail versus a long tail. The data, expressed in millions of Euros, 
were first shifted by subtracting 1.2 million. The test statistic is 

(20) $$T_n = \frac{-\ln \bar F_n[\ln X_{(n)}]\,(X_{(n)} - X_{(n-1)})}{\ln X_{(n)}}.$$

For claims above 1.2 million Euros the value of the test statistic is 70.40. Since under the null 
hypothesis of an ES medium tail the test statistic is distributed like an Exp(1), the p-value is 
< .001. Thus the distribution is classified as long-tailed. 

[Figure 1 - Histogram of Secura Belgian Re] 
[Figure 2 - Standard exponential q-q plot for Secura Belgian Re] 
[Figure 5 - Flood level in thousands of cubic feet per second] 
[Figure 6 - Standard exponential quantiles] 

The next example is depicted in Figures 3 and 4, which show the histogram and q-q plot of the 
breaking strengths of 63 glass fibers of 1.5 cm in length. These data appeared in Smith and Naylor (1987) 
and the left tail was analyzed by Coles (2001). Here, the right tail is considered. The q-q plot suggests, at 
first glance, a short right tail since the empirical quantiles fall below the exponential quantiles. 

The value of the test statistic is .014, which under the null hypothesis of a medium right tail gives 
a p-value of .014. Therefore the test rejects the null hypothesis of a medium tail in favor of the 
alternative of a short right tail. 

The third example analyzes the annual maximum discharge, in thousands of cubic feet per second, of the 
Feather River from 1902 to 1960. This data set has been described and analyzed with classical 
extreme value methods by Reiss and Thomas (2000). A Gumbel distribution was fitted to the data. 
The histogram and q-q plot are shown in Figures 5 and 6. 

The q-q plot suggests a medium right tail. After subtracting the smallest observation from the 
data, the test statistic yields a value of .35 with a corresponding p-value of .7, so the null hypothesis 
of a medium tail is not rejected. 
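
The three p-values quoted in this section are consistent with tail probabilities of the Exp(1) null distribution. The sketch below reproduces them under the assumption that the short-tail p-value is the lower-tail probability and that the long-tail and Feather River p-values are upper-tail probabilities; the function name is hypothetical.

```python
import math

def es_p_values(t):
    """Return (p_short, p_long) for an observed statistic t under Exp(1)."""
    p_short = -math.expm1(-t)   # P(T <= t) = 1 - exp(-t); small t favors a short tail
    p_long = math.exp(-t)       # P(T >= t); large t favors a long tail
    return p_short, p_long

examples = [("Secura Belgian Re", 70.40), ("glass fibers", 0.014), ("Feather River", 0.35)]
for label, t in examples:
    p_short, p_long = es_p_values(t)
    print(f"{label}: p_short = {p_short:.3g}, p_long = {p_long:.3g}")
```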

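10. Appendix. 

Proof of Theorem 4. Suppose that $\bar F(y)/(\bar F(\ln y))^{\delta} \to 0$ as $y \to \infty$ for some $\delta > 2$, and let $0 < \gamma < 1/2$, 
$\epsilon_n = n^{\gamma - 1/2}$, and define $A_n = \{\|F_n - F\| > \epsilon_n\}$ and $B_n = \{\bar F(\ln X_{(n)}) < \epsilon_n\}$, where $\|F_n - F\| = \sup_x |F_n(x) - F(x)|$. Let 

$$Z_n = \frac{-\ln \bar F_n(\ln X_{(n)})}{-\ln \bar F(\ln X_{(n)})}$$

and consider first, for $\varepsilon > 0$, 

$$P(Z_n > 1+\varepsilon) = P(Z_n > 1+\varepsilon, A_n) + P(Z_n > 1+\varepsilon, A_n^c) \le 2e^{-2n\epsilon_n^2} + P(Z_n > 1+\varepsilon, A_n^c).$$

Thus, it is enough to show that the second term in the last expression goes to zero as $n \to \infty$. Now, 
the second term in the last expression may be bounded from above as follows: 

$$P(\{Z_n > 1+\varepsilon\} \cap A_n^c \cap B_n) + P\Big(\Big\{\frac{-\ln(\bar F(\ln X_{(n)}) - \epsilon_n)}{-\ln \bar F(\ln X_{(n)})} > 1+\varepsilon\Big\} \cap A_n^c \cap B_n^c\Big)$$
$$\le P(B_n) + P\Big(\Big\{\frac{-\ln(\bar F(\ln X_{(n)}) - \epsilon_n)}{-\ln \bar F(\ln X_{(n)})} > 1+\varepsilon\Big\} \cap A_n^c \cap B_n^c\Big).$$

Now, note that 

$$P(B_n) = P\big(\ln X_{(n)} > \bar F^{-1}(\epsilon_n)\big) = 1 - P\big(X_{(n)} \le \exp\{\bar F^{-1}(n^{\gamma-1/2})\}\big) = 1 - \big\{1 - \bar F(\exp\{\bar F^{-1}(n^{\gamma-1/2})\})\big\}^{n}.$$

But $n\bar F(\exp\{\bar F^{-1}(n^{\gamma-1/2})\}) = o(1)$ as a consequence of the assumption that $\bar F(y)/(\bar F(\ln y))^{\delta} \to 0$ 
as $y \to \infty$, for some $\delta > 2$. To see this, set $\kappa = \gamma - 1/2$ and $u = \bar F^{-1}(n^{\kappa})$, so that $u \to \infty$ as 
$n \to \infty$, and $n\bar F(\exp(\bar F^{-1}(n^{\kappa}))) = (\bar F(u))^{1/\kappa}\bar F(e^{u})$. Finally, setting $y = e^{u}$, the last expression 
is seen to be equal to $\bar F(y)/(\bar F(\ln y))^{-1/\kappa} \to 0$ as $y \to \infty$, since $-1/\kappa > 2$. Therefore, $P(B_n) \to 0$ as 
$n \to \infty$. It remains to prove that 

$$P\Big(\Big\{\frac{-\ln(\bar F(\ln X_{(n)}) - \epsilon_n)}{-\ln \bar F(\ln X_{(n)})} > 1+\varepsilon\Big\} \cap A_n^c \cap B_n^c\Big) \to 0 \quad \text{as } n \to \infty.$$

Write $-\ln(\bar F(\ln X_{(n)}) - \epsilon_n) = -\ln \bar F(\ln X_{(n)}) + \epsilon_n/(1-\xi_n)$, where $F(\ln X_{(n)}) < \xi_n < F(\ln X_{(n)}) + \epsilon_n$, 
so that, for $0 < a < 1$, after setting $C_n = A_n^c \cap B_n^c$, 

$$P\Big(\Big\{\frac{-\ln(\bar F(\ln X_{(n)}) - \epsilon_n)}{-\ln \bar F(\ln X_{(n)})} > 1+\varepsilon\Big\} \cap C_n\Big) = P\Big(\Big\{\frac{\epsilon_n}{(1-\xi_n)(-\ln \bar F(\ln X_{(n)}))} > \varepsilon\Big\} \cap C_n\Big)$$
$$\le P\Big(\Big\{\frac{\epsilon_n}{(\bar F(\ln X_{(n)}) - \epsilon_n)(-\ln \bar F(\ln X_{(n)}))} > \varepsilon\Big\} \cap \big\{n^{1/2-\gamma}\bar F(\ln X_{(n)}) > 1+a\big\}\Big) + P\big(n^{1/2-\gamma}\bar F(\ln X_{(n)}) \le 1+a\big)$$
$$\le P\Big(\Big\{\frac{1}{(n^{1/2-\gamma}\bar F(\ln X_{(n)}) - 1)(-\ln \bar F(\ln X_{(n)}))} > \varepsilon\Big\} \cap \big\{n^{1/2-\gamma}\bar F(\ln X_{(n)}) > 1+a\big\}\Big) + P\big(n^{1/2-\gamma}\bar F(\ln X_{(n)}) \le 1+a\big)$$
$$\le P\Big(\frac{1}{a(-\ln \bar F(\ln X_{(n)}))} > \varepsilon\Big) + P\big(n^{1/2-\gamma}\bar F(\ln X_{(n)}) \le 1+a\big).$$

Since $a > 0$ while $-\ln \bar F(\ln X_{(n)}) \to \infty$ almost surely, the first term on the right side of the last 
inequality goes to zero. For the second term, note that 

$$P\big(n^{1/2-\gamma}\bar F(\ln X_{(n)}) \le 1+a\big) = P\big(X_{(n)} \ge \exp\{\bar F^{-1}(cn^{\gamma-1/2})\}\big) = 1 - \big[1 - \bar F(\exp\{\bar F^{-1}(cn^{\gamma-1/2})\})\big]^{n},$$

where $c = 1 + a$. It then follows as before that $\bar F(y)/(\bar F(\ln y))^{\delta} \to 0$ for some $\delta > 2$ implies that 
$n\bar F(\exp\{\bar F^{-1}(cn^{\gamma-1/2})\}) \to 0$ as $n \to \infty$. Therefore $P(n^{1/2-\gamma}\bar F(\ln X_{(n)}) \le 1+a) \to 0$ and hence 

$$P\Big(\frac{-\ln \bar F_n(\ln X_{(n)})}{-\ln \bar F(\ln X_{(n)})} > 1+\varepsilon\Big) \to 0.$$

The case $P\big(\frac{-\ln \bar F_n(\ln X_{(n)})}{-\ln \bar F(\ln X_{(n)})} < 1-\varepsilon\big) = P(Z_n < 1-\varepsilon)$ is handled in a similar fashion. 
Consider now 

$$P(Z_n < 1-\varepsilon) = P(\{Z_n < 1-\varepsilon\} \cap A_n) + P(\{Z_n < 1-\varepsilon\} \cap A_n^c) \le 2e^{-2n\epsilon_n^2} + P\Big(\frac{-\ln(\bar F(\ln X_{(n)}) + \epsilon_n)}{-\ln \bar F(\ln X_{(n)})} < 1-\varepsilon\Big).$$

As before, write $-\ln(\bar F(\ln X_{(n)}) + \epsilon_n) = -\ln \bar F(\ln X_{(n)}) - \epsilon_n/\xi_n$, where $\xi_n$ satisfies $\bar F(\ln X_{(n)}) < 
\xi_n < \bar F(\ln X_{(n)}) + \epsilon_n$. Then 

$$P\Big(\frac{-\ln(\bar F(\ln X_{(n)}) + \epsilon_n)}{-\ln \bar F(\ln X_{(n)})} < 1-\varepsilon\Big) = P\Big(\frac{\epsilon_n}{\xi_n(-\ln \bar F(\ln X_{(n)}))} > \varepsilon\Big)$$
$$\le P\Big(\frac{\epsilon_n}{\bar F(\ln X_{(n)})(-\ln \bar F(\ln X_{(n)}))} > \varepsilon\Big) = P\Big(\frac{1}{n^{1/2-\gamma}\bar F(\ln X_{(n)})(-\ln \bar F(\ln X_{(n)}))} > \varepsilon\Big)$$
$$\le P\Big(\frac{1}{-\ln \bar F(\ln X_{(n)})} > \varepsilon,\; n^{1/2-\gamma}\bar F(\ln X_{(n)}) > 1\Big) + P\big(n^{1/2-\gamma}\bar F(\ln X_{(n)}) \le 1\big)$$
$$\le P\Big(\frac{1}{-\ln \bar F(\ln X_{(n)})} > \varepsilon\Big) + P\big(n^{1/2-\gamma}\bar F(\ln X_{(n)}) \le 1\big).$$

Similar arguments to those used before then yield the result that both terms on the right side of 
the above inequality go to zero as $n \to \infty$. Thus, 

$$\frac{-\ln \bar F_n(\ln X_{(n)})}{-\ln \bar F(\ln X_{(n)})} \xrightarrow{P} 1.$$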
Proof of Lemma 5. Without loss of generality it is assumed that F is a life distribution. The 
proof follows easily by writing, after a one-step Taylor's expansion, 

(21) $$\ln\Big\{\frac{\bar F(t+x)}{\bar F(x)}\Big\} = -\int_{x}^{t+x} r_F(u)\,du = -t\, r_F(\xi),$$

where $x < \xi < x+t$. Consider first (i). Then $\bar F(t+x)/\bar F(x) \to 0$ for all $t > 0$ as $x \to \infty$. Thus (21) is equivalent 
to $t\, r_F(\xi) \to \infty$ as $x \to \infty$ for all $t > 0$. In the case of (ii), $\bar F(\ln x) = x^{-\theta} l(x)$ for some $\theta > 0$ and 
some slowly varying function $l(x)$. Using (12), it is clear that, by L'Hopital's rule, 

$$\lim_{y\to\infty} r_F(y) = \lim_{y\to\infty} \frac{-\ln \bar F(\ln y)}{\ln y} = \theta - \lim_{y\to\infty} \frac{\ln l(y)}{\ln y}.$$

The result follows immediately since, for slowly varying $l$, $\ln l(y)/\ln y \to 0$. The converse follows 
immediately from (21) after taking the limit as $x \to \infty$. 

Case (iii) also follows directly from (21), since F being long-tailed is equivalent to the expressions 
in (21) converging to 0. 

Proof of Lemma 6. Consider first (i). Let $\bar F(\ln x) = c(x)\exp\{\int^{x} \frac{\epsilon(t)}{t}\,dt\}$, with $c(x) \to c > 0$ and 
$\epsilon(x) \to 0$, as $x \to \infty$. Since F is long-tailed, $r_F(x) \to 0$ as $x \to \infty$. Therefore, 

$$\frac{r_F(\ln x)}{x} = -\frac{c'(x)}{c(x)} - \frac{\epsilon(x)}{x}.$$

This then implies that $r_F(\ln x) = -\frac{x c'(x)}{c(x)} - \epsilon(x) \to 0$, and therefore $x c'(x)/c(x) \to 0$. Writing 
$\epsilon^*(t) = \epsilon(t) + t c'(t)/c(t)$, it follows that $\bar F(\ln x) = \exp\{\int^{x} \frac{\epsilon^*(t)}{t}\,dt\}$ and the result follows since then 
$r_F(\ln x) = -\epsilon^*(x)$. 

The proof of (ii) is similar to that of (i) after writing, for short-tailed F, 

$$\bar F(\ln x) = c(x)\exp\Big\{\int^{x} \frac{z(t)}{t}\,dt\Big\}$$

with $c(x) \to c > 0$ and $z(t) \to -\infty$, as $x \to \infty$, and recalling, from Lemma 5, that $r_F(x) \to \infty$. 
To prove (iii), note that (17) holds if and only if 

(22) $$-\ln \bar F(y) + \delta \ln \bar F(\ln y) \to \infty, \quad \text{as } y \to \infty.$$

Writing $-\ln \bar F(y) + \delta \ln \bar F(\ln y) = -\ln \bar F(y)\big(1 - \delta \frac{\ln \bar F(\ln y)}{\ln \bar F(y)}\big)$ 
and then noticing that 

(23) $$\lim_{y\to\infty} \frac{\ln \bar F(\ln y)}{\ln \bar F(y)} = \lim_{y\to\infty} \frac{r_F(\ln y)}{y\, r_F(y)} = \lim_{y\to\infty} \frac{\epsilon(y)}{y\,\epsilon(e^{y})} = \lim_{y\to\infty} \frac{z(y)}{y\, z(e^{y})},$$

where the third and fourth terms in the string of identities correspond to the cases of long- and 
short-tails respectively. The result for medium-tailed distributions follows immediately from Lemma 
5 since in that case $r_F(\ln y)/(y\, r_F(y)) \to 0$, and for the short- and long-tailed distributions the 
results follow from the assumptions. 



Proof of Theorem 7. Since F is short-tailed, $X_{(n)} - X_{(n-1)} \to 0$ and $h(x) = -\ln \bar F(\ln x)$ is 
regularly varying with index $\gamma$. That is, 

(24) $$h(x) = x^{\gamma} l(x), \quad 0 \le \gamma \le \infty,$$

for some slowly varying function $l(x)$. The case $\gamma = \infty$ represents the case where $h(x)$ is rapidly 
varying. It follows that 

(25) $$\bar F^{-1}(u) = \ln h^{-1}(-\ln u)$$

with $h^{-1}$ regularly varying with index $1/\gamma$. Thus, when $\gamma = 0$, $h^{-1}$ is rapidly varying. Under the 
assumptions of Theorem 4, $\frac{-\ln \bar F_n(\ln X_{(n)})}{-\ln \bar F(\ln X_{(n)})} \to 1$ and it is enough to consider the behavior of 

(26) $$\frac{-\ln \bar F(\ln X_{(n)})}{\ln X_{(n)}}\,(X_{(n)} - X_{(n-1)}).$$

We prove that (26) converges to zero in probability. The cases when $\gamma = 0$ or $\infty$ follow immediately 
from properties of slowly and rapidly varying functions. The case of $0 < \gamma < \infty$ presents the most 
technical challenges and will be considered first. Recall that a positive function $g$ defined on some 
neighborhood of $\infty$ varies smoothly with index $\eta \in R$, $g \in SR_{\eta}$, if $H(x) = \ln g(e^{x}) \in C^{\infty}$ with 
$H'(x) \to \eta$, $H^{(n)}(x) \to 0$, for $n = 2, 3, \ldots$ as $x \to \infty$. 

The following theorem (see Bingham, Goldie, and Teugels [1]) will allow us to assume, without 
loss of generality, that $h(x)$ is smoothly varying, so that, as a consequence, $\lim_{t\to\infty} h'(t)t/h(t) = \gamma$. 

Theorem 10. Let $g \in R_{\eta}$. Then there exist $g_1, g_2 \in SR_{\eta}$ with $g_1 \sim g_2$ and $g_1 \le g \le g_2$ on some 
neighborhood of $\infty$. In particular, if $g \in R_{\eta}$, there exists $g^* \in SR_{\eta}$ with $g^* \sim g$. 

Let then $0 < \gamma < \infty$ and let $U_{(1)}, \ldots, U_{(n)}$ represent the order statistics from a uniform 
distribution on (0, 1). Since $X_{(n)} = \bar F^{-1}(1 - U_{(n)}) \stackrel{D}{=} \bar F^{-1}(U_{(1)})$ and $X_{(n-1)} = \bar F^{-1}(1 - U_{(n-1)}) \stackrel{D}{=} \bar F^{-1}(U_{(2)})$, 
where $\stackrel{D}{=}$ denotes equality in distribution, expression (26) has the same distribution as 

(27) $$\frac{(\bar F^{-1}(U_{(1)}))^{\gamma}\, l(\bar F^{-1}(U_{(1)}))}{\ln \bar F^{-1}(U_{(1)})}\,\big(\bar F^{-1}(U_{(1)}) - \bar F^{-1}(U_{(2)})\big),$$

where (27) follows from (24) and the fact that $h(x) = -\ln \bar F(\ln x)$. Using a one-step Taylor's 
expansion, we get 

$$\bar F^{-1}(U_{(1)}) - \bar F^{-1}(U_{(2)}) = \ln h^{-1}(-\ln U_{(1)}) - \ln h^{-1}(-\ln U_{(2)}) = \frac{U_{(2)} - U_{(1)}}{\xi_n\, h^{-1}(-\ln \xi_n)\, h'(h^{-1}(-\ln \xi_n))}, \quad U_{(1)} < \xi_n < U_{(2)}.$$

Therefore, (27) is bounded above by 

(28) $$\frac{(\bar F^{-1}(U_{(1)}))^{\gamma}\, l(\bar F^{-1}(U_{(1)}))}{\ln \bar F^{-1}(U_{(1)})}\cdot\frac{U_{(2)} - U_{(1)}}{U_{(1)}\, h^{-1}(-\ln \xi_n)\, h'(h^{-1}(-\ln \xi_n))}
= \frac{(\ln h^{-1}(-\ln U_{(1)}))^{\gamma}\, l(\ln h^{-1}(-\ln U_{(1)}))}{\ln\ln h^{-1}(-\ln U_{(1)})\; h^{-1}(-\ln \xi_n)\, h'(h^{-1}(-\ln \xi_n))}\cdot\frac{U_{(2)} - U_{(1)}}{U_{(1)}}.$$

Since 

$$\frac{U_{(2)} - U_{(1)}}{U_{(1)}} \stackrel{D}{=} \frac{1}{V} - 1,$$

where $V \sim U(0, 1)$, and since $\ln\ln h^{-1}(-\ln U_{(1)}) \to \infty$ a.s., while, as a consequence of Theorem 10, 

(29) $$\frac{h^{-1}(-\ln U)\, h'(h^{-1}(-\ln U))}{h(h^{-1}(-\ln U))} \to \gamma,$$

then, to show that 

$$\frac{-\ln \bar F(\ln X_{(n)})}{\ln X_{(n)}}\,(X_{(n)} - X_{(n-1)}) \to 0$$

it is enough to show that 

(30) $$\frac{(\ln h^{-1}(-\ln U_{(1)}))^{\gamma}\, l(\ln h^{-1}(-\ln U_{(1)}))}{-\ln \xi_n} \to 0.$$

To verify (29), write $h(x) = -\ln \bar F(\ln x)$ so that $h'(x) = r_F(\ln x)/x$ and $h^{-1}(t) = \exp\{\bar F^{-1}(e^{-t})\}$; it 
follows that, after setting $t = \bar F^{-1}(U)$, 

$$\frac{h^{-1}(-\ln U)\, h'(h^{-1}(-\ln U))}{h(h^{-1}(-\ln U))} = \frac{r_F(t)}{-\ln \bar F(t)} = \frac{d}{dt}\ln(-\ln \bar F(t))\Big|_{t = \bar F^{-1}(U)} = \frac{d}{dt}\ln h(e^{t}).$$

Thus, (29) follows from Theorem 10 with $\eta = \gamma$. To prove (30), rewrite it as 

$$\frac{(\ln h^{-1}(-\ln U_{(1)}))^{\gamma}\, l(\ln h^{-1}(-\ln U_{(1)}))}{-\ln U_{(1)}}\cdot\frac{-\ln U_{(1)}}{-\ln \xi_n},$$

which is bounded above by 

$$\frac{(\ln h^{-1}(-\ln U_{(1)}))^{\gamma}\, l(\ln h^{-1}(-\ln U_{(1)}))}{-\ln U_{(1)}}\cdot\frac{\ln U_{(1)}}{\ln U_{(2)}}.$$

Now observe that $P\big(\frac{\ln U_{(1)}}{\ln U_{(2)}} > 2\big) = o(1)$ and in fact, $P\big(\frac{\ln U_{(1)}}{\ln U_{(2)}} > 3 \text{ i.o.}\big) = 0$. Therefore, writing 
$t_n = -\ln U_{(1)}$, 

$$\frac{(\ln h^{-1}(t_n))^{\gamma}\, l(\ln h^{-1}(t_n))}{t_n} \to 0,$$

since $h^{-1}$ is $R_{1/\gamma}$ so that $\ln h^{-1}$ is slowly varying, and hence $l(\ln h^{-1}(t_n))$ and $(\ln h^{-1}(t_n))^{\gamma}$ are slowly 
varying. It follows that (30) is true since $l(x)/x \to 0$ for slowly varying $l$. 

Consider now the case of $\gamma = 0$. Since $\frac{\ln U_{(1)}}{\ln \xi_n} \to 1$ in probability, it follows from (28) that to 
show that (26) converges to zero in probability, it is enough to show that 

(31) $$\frac{h(\ln h^{-1}(-\ln U_{(1)}))}{\ln\ln h^{-1}(-\ln U_{(1)})\; h^{-1}(-\ln U_{(1)})\, h'(h^{-1}(-\ln U_{(1)}))} \to 0,$$

and this follows directly, after writing $y = \ln h^{-1}(-\ln U_{(1)})$, from the assumptions in the case of 
$\gamma = 0$, since in this case $r_F(x) = e^{x}h'(e^{x})$ and $-\ln \bar F(\ln x)/\ln x \sim r_F(\ln x)$. 

Finally, consider the case of $\gamma = \infty$. That is, suppose that $-\ln \bar F(\ln x)$ is rapidly varying. The 
condition given by (31) is seen to be equivalent to 

(32) $$\frac{h(x)}{e^{x}h'(e^{x})\ln x} \to 0, \quad \text{as } x \to \infty.$$

Recall that a rapidly varying function $h(x)$ may be written as $c(x)\exp\big(\int^{x} \frac{z(t)}{t}\,dt\big)$, with $c(x) \to c$ 
and $z(t) > 0$, $z(t) \to \infty$, as $t \to \infty$. Assuming without loss of generality that $c(x) = c$, it is clear 
that $e^{x}h'(e^{x}) = z(e^{x})h(e^{x})$, and hence, 

(33) $$\frac{h(x)}{e^{x}h'(e^{x})\ln x} = \frac{h(x)}{z(e^{x})h(e^{x})\ln x}.$$

Since $z(x) \to \infty$ as $x \to \infty$, while $h$ is non-decreasing, it follows that (32) holds. 
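
The proof above relies on the distributional identity $(U_{(2)} - U_{(1)})/U_{(1)} \stackrel{D}{=} 1/V - 1$ with $V \sim U(0, 1)$, which amounts to saying that the ratio $U_{(1)}/U_{(2)}$ of the two smallest uniform order statistics is itself uniform on (0, 1). The following simulation sketch, with an arbitrary sample size and seed, offers a quick numerical check of that fact:

```python
import random

def ratio_of_two_smallest(n, rng):
    """Return U_(1) / U_(2) for n independent U(0, 1) draws."""
    draws = sorted(1.0 - rng.random() for _ in range(n))   # values in (0, 1]
    return draws[0] / draws[1]

rng = random.Random(1)
ratios = [ratio_of_two_smallest(20, rng) for _ in range(20000)]

# For a U(0, 1) variable the mean is 1/2 and P(V < 1/4) = 1/4.
mean_ratio = sum(ratios) / len(ratios)
share_below_quarter = sum(r < 0.25 for r in ratios) / len(ratios)
print(mean_ratio, share_below_quarter)
```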



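Proof of Theorem 8. For long-tailed F, $\bar F(\ln x) = L(x)$ for some slowly varying $L$. Therefore, 
$-\ln \bar F(\ln x) = -\ln L(x)$ and hence $r_F(\ln x) = -xL'(x)/L(x)$, where $L(x) = c(x)\exp\big(\int^{x} \frac{\epsilon(t)}{t}\,dt\big)$. 

Under the conditions of Theorem 4, consider 

$$\frac{-\ln \bar F(\ln X_{(n)})}{\ln X_{(n)}} \quad \text{instead of} \quad \frac{-\ln \bar F_n(\ln X_{(n)})}{\ln X_{(n)}}$$

and write 

$$\frac{-\ln \bar F(\ln X_{(n)})}{\ln X_{(n)}} = \frac{-\ln \bar F(\ln X_{(n-1)})}{\ln X_{(n-1)}} + \Big[\frac{r_F(\ln \zeta_n)}{\zeta_n \ln \zeta_n} + \frac{\ln \bar F(\ln \zeta_n)}{\zeta_n(\ln \zeta_n)^{2}}\Big](X_{(n)} - X_{(n-1)})$$

for some $\zeta_n$ with $X_{(n-1)} < \zeta_n < X_{(n)}$. Note that, almost surely, as $n \to \infty$, 

$$\frac{\ln \bar F(\ln \zeta_n)}{\zeta_n(\ln \zeta_n)^{2}} \sim \frac{-r_F(\ln \zeta_n)}{\zeta_n(2\ln \zeta_n + (\ln \zeta_n)^{2})}.$$

Therefore, almost surely, for sufficiently large n, 

$$T_n \ge \frac{-\ln \bar F(\ln X_{(n-1)})}{\ln X_{(n-1)}}\,(X_{(n)} - X_{(n-1)}).$$

Since $\bar F^{-1}(u) = \ln(L^{-1}(u))$, we can write, using a one-step Taylor's expansion, 

$$\frac{-\ln \bar F(\ln X_{(n-1)})}{\ln X_{(n-1)}}\,(X_{(n)} - X_{(n-1)}) = \frac{-\ln L(X_{(n-1)})}{\ln X_{(n-1)}}\,\big(\ln L^{-1}(U_{(1)}) - \ln L^{-1}(U_{(2)})\big)$$
$$= \frac{-\ln L(\ln L^{-1}(U_{(2)}))}{\ln\ln L^{-1}(U_{(2)})}\cdot\frac{U_{(1)} - U_{(2)}}{L'(L^{-1}(\psi_n))\,L^{-1}(\psi_n)},$$

where $U_{(1)}, U_{(2)}$ are the first and second order statistics from a $U(0, 1)$ random sample, and 
$U_{(1)} < \psi_n < U_{(2)}$. Writing $U_{(1)} - U_{(2)} = -(1 - V)U_{(2)}$, where $V \sim U(0, 1)$ with $V$ independent of 
$U_{(2)}$, and since $\psi_n < U_{(2)}$ and $L'(x) < 0$, it follows that $T_n$ is at least as large as 

(34) $$\frac{-\ln L(\ln L^{-1}(U_{(2)}))}{\ln\ln L^{-1}(U_{(2)})}\cdot\frac{(1 - V)\,L(L^{-1}(\psi_n))}{-L'(L^{-1}(\psi_n))\,L^{-1}(\psi_n)}.$$

Since $r_F$ is eventually decreasing, $-xL'(x)/L(x)$ is eventually decreasing, and since $L^{-1}(\psi_n) > 
L^{-1}(U_{(2)})$, it follows that 

$$\frac{L(L^{-1}(\psi_n))}{-L'(L^{-1}(\psi_n))\,L^{-1}(\psi_n)} \ge \frac{L(L^{-1}(U_{(2)}))}{-L'(L^{-1}(U_{(2)}))\,L^{-1}(U_{(2)})}$$

and hence, (34) implies that, almost surely, for large n, after setting $Y_n = L^{-1}(U_{(2)})$, 

(35) $$T_n \ge \frac{-\ln L(\ln Y_n)\,(1 - V)\,L(Y_n)}{\ln\ln Y_n\,(-L'(Y_n))\,Y_n}$$

with $V$ independent of $Y_n$, and $Y_n \to \infty$ almost surely. Now 

$$\frac{L(y)}{-L'(y)\,y} = \frac{1}{r_F(\ln y)} \quad \text{while} \quad \frac{-\ln L(\ln y)}{\ln\ln y} \sim \frac{-L'(\ln y)\ln y}{L(\ln y)} = r_F(\ln\ln y).$$

Therefore, it follows that the right side of (35) is asymptotically equivalent, almost surely, to 

$$(1 - V)\,\frac{r_F(\ln\ln Y_n)}{r_F(\ln Y_n)}.$$

The result then follows from the assumption that $r_F(y) = o(r_F(\ln y))$. 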

Proof of Theorem 9. As in the proof of the previous theorem, we can write, almost surely, for 
sufficiently large n, 

(36) $$T_n \ge \frac{-\ln \bar F(\ln X_{(n-1)})}{\ln X_{(n-1)}}\,(X_{(n)} - X_{(n-1)}).$$

Since $\bar F^{-1}(u) = L^{-1}(u)$, we can write, using a one-step Taylor's expansion, 

(37) $$\frac{-\ln \bar F(\ln X_{(n-1)})}{\ln X_{(n-1)}}\,(X_{(n)} - X_{(n-1)})$$
(38) $$= \frac{-\ln L(\ln L^{-1}(U_{(2)}))}{\ln L^{-1}(U_{(2)})}\cdot\frac{U_{(1)} - U_{(2)}}{L'(L^{-1}(\psi_n))}$$
(39) $$= \frac{-\ln L(\ln L^{-1}(U_{(2)}))}{\ln L^{-1}(U_{(2)})}\cdot\frac{(1 - V)\,U_{(2)}}{-L'(L^{-1}(\psi_n))},$$

where $U_{(1)}, U_{(2)}$ are the first and second order statistics from a $U(0, 1)$ random sample, and 
$U_{(1)} < \psi_n < U_{(2)}$, with $V$ independent of $U_{(2)}$. 

But, since $L^{-1}$ is decreasing and, therefore, $L^{-1}(\psi_n) \ge L^{-1}(U_{(2)})$, and $-L'/L$ is eventually 
decreasing, then 

$$\frac{U_{(2)}}{-L'(L^{-1}(\psi_n))} \ge \frac{L(L^{-1}(\psi_n))}{-L'(L^{-1}(\psi_n))} \ge \frac{L(L^{-1}(U_{(2)}))}{-L'(L^{-1}(U_{(2)}))}.$$

It follows from (37) that, after setting $Y_n = L^{-1}(U_{(2)})$, 

$$\frac{-\ln \bar F(\ln X_{(n-1)})}{\ln X_{(n-1)}}\,(X_{(n)} - X_{(n-1)}) \ge \frac{-\ln L(\ln Y_n)\,L(Y_n)\,(1 - V)}{\ln Y_n\,(-L'(Y_n))} = \frac{-\ln L(\ln Y_n)\,(1 - V)}{\ln Y_n\,\big(-L'(Y_n)/L(Y_n)\big)} \sim \frac{-\ln L(\ln Y_n)\,Y_n\,(1 - V)}{\ln Y_n\,(-\epsilon(Y_n))} \to \infty,$$

since $-\epsilon(Y_n) \to a > 0$ and $\frac{-\ln L(\ln Y_n)}{\ln Y_n}\,Y_n \to \infty$, almost surely. 